G. YE. SHILOV 


MATHEMATICAL 
ANALYSIS 


A Special Course 


TRANSLATED BY 


J.D. DAVIS 


ENGLISH TRANSLATION EDITED 
BY 


D.A.R.WALLACE 


Department of Mathematics 
The University of Glasgow 


PERGAMON PRESS 


OXFORD - LONDON - EDINBURGH - NEW YORK 
PARIS - FRANKFURT


Pergamon Press Ltd., Headington Hill Hall, Oxford 
4 & 5 Fitzroy Square, London W.1 


Pergamon Press (Scotland) Ltd., 2 & 3 Teviot Place, Edinburgh 1
Pergamon Press Inc., 122 East 55th St., New York 22, N.Y. 


Pergamon Press GmbH, Kaiserstrasse 75, Frankfurt-am-Main


Copyright © 1965 
Pergamon Press Ltd. 


First edition 1965 


Library of Congress Catalog Card No. 65-18619 


A translation of the original volume 
Математический анализ, специальный курс
(Matematicheskii Analiz, Spetsial’nyi Kurs) 

Fizmatgiz, Moscow 1961 


PREFATORY NOTE 


The book was written as a textbook for University Faculties of
Mathematics, in a special course of Mathematical analysis. Problems
in the theory of functions of a real variable, the calculus of
variations and integral equations are treated here from the single 
viewpoint of the theory of linear spaces. The reader requires a 
grasp of a general course of mathematical analysis as covered in 
a degree course. 


FOREWORD 


This book has been written as a textbook for a special course of
mathematical analysis (in brief, “Analysis III”). “Analysis III” is
delivered as the third course of the mechanics-mathematics faculty
of the Moscow State University, and dates from 1949; the introduction
of such a course was initiated by academician A. N. Kolmogorov,
who was the first lecturer. “Analysis III” is based on material
from earlier separate courses on the theory of functions of a real 
variable, the calculus of variations and integral equations, and 
treats all this material from a single viewpoint which has its source 
in the theory of linear spaces. 

The layout of the book is according to the following scheme. The 
first chapter gives a straightforward minimal account of set 
theory. The second chapter contains elements of the theory of 
metric and normed linear spaces. In the third chapter the calculus 
of variations is developed; it is presented here as a theory of 
differentiable functionals in normed linear spaces. The fourth 
chapter is devoted to the theory of the Lebesgue integral; the 
scheme given by F. Riesz is thought to be a better starting-point 
for the account than that of Lebesgue, inasmuch as it is more 
economical and leads more rapidly to the heart of the subject. 
The fifth chapter, “The Geometry of Hilbert Space”, contains the
theory of orthogonal resolutions and a geometrical treatment of 
integral equations. In the sixth chapter the relation between inte- 
gration and differentiation is explained and the Stieltjes integral is 
constructed. In the seventh and last chapter an account is given 
of the theory of the Fourier Transform; we depart from tradition 
here in including some material, which, in view of its special im- 
portance in mathematical physics, should long ago have had a 
place in a course of mathematical analysis. To facilitate modi- 
fications in a course of lectures, the material of the last three 
chapters is widely diversified. 

The logical dependence between chapters is given schematically 
as follows: 




Sets 
Metric spaces 
Lebesgue integral Calculus of variations 
Hilbert space Integration and differentiation 
Fourier transform 


It must be observed that the general viewpoint of functional 
analysis developed in this course does not constitute an end in 
itself, but only a means; the chief aim is an introduction to the 
field of classical mathematical analysis. 

For the sake of brevity we limit ourselves within each theme to 
a discussion of only the most important questions, fully recognising 
that the reader may well be dissatisfied in some instances. The 
choice of material generally, and in particular for the last chapters, 
presented the author with great difficulties. Some interesting but 
somewhat digressional questions have been set as problems; they 
can be used as material for seminars. 

The reader needs to be acquainted with a general course of
mathematical analysis, such as “A short course” by A. Ya. Khinchin
for example. The book can then be used for an independent
study of the subject. Towards the end of the book the elementary 
properties of analytic functions are also assumed to be familiar. 

The author is particularly grateful to M. G. Krein, O. A. Oleinik,
and D. A. Rykov, who have read through the manuscript and with
their criticisms have contributed greatly to its improvement. In 
the second edition the text has been revised, supplemented, and 
improved in places. Some new problems have also been added. The 
author is indebted to his many correspondents in Leningrad, 
Kazan, Baku, and other cities of the U.S.S.R. for their valuable 
criticisms of the first edition. 


CHAPTER I 


SETS 


1. SETS, SUBSETS, INCLUSIONS


When we consider a number of objects (“elements”), we use
such terms as “totality”, “aggregate”, and “set”. For example,
one can speak of the set of students in an audience, the set of 
grains of sand on a beach, the set of vertices of a polygon, or the 
set of its sides. The examples specified have the property that in 
each of them the corresponding set is composed of a definite 
number of elements (which may, in some cases, be difficult to 
determine). We shall call such sets finite. 

In mathematics it is often necessary to deal with sets which are 
not composed of a finite number of objects; the simplest examples 
are provided by the set of all natural numbers 1, 2,3, ... and the 
set of all points of an interval. We shall call such sets infinite. We 
also add to the totality of sets the empty set—the set which con- 
tains no elements at all. 

So, for example, as can be seen from Fig. 1, the set of real roots
of the equation $\frac{\sin x}{x} = b$ is infinite when $b = 0$ (in
this case it consists of all the values $x = \pm\pi, \pm 2\pi, \ldots$),
finite but non-empty when $0 < |b| < 1$ (the precise number of roots
can be calculated for each $b$), and empty when $|b| > 1$ (no value of
the function $\frac{\sin x}{x}$ exceeds 1 in absolute value, so the
equation $\frac{\sin x}{x} = b$ has no roots at all when $|b| > 1$).



As a rule we shall denote sets by capital letters $A, B, C, \ldots$,
and their elements by small ones. The notation $a \in A$ (or $A \ni a$)
indicates that $a$ is an element of the set $A$; $a \notin A$ (or
$A \not\ni a$) means that $a$ is not an element of the set $A$. The
notation $A \subset B$ (or $B \supset A$) indicates that each element
of the set $A$ is an element of the set $B$; in this case the set $A$
is called a subset of the set $B$. The largest subset of the set $B$
is evidently the set $B$ itself; the smallest is the empty set. Any
other subset of $B$ certainly contains some, but not all, of the
elements of $B$. Each such subset is called a proper subset. The
symbols $\in$, $\ni$, $\subset$, $\supset$ are called inclusion signs.
If the inclusions $A \subset B$, $B \subset A$ both hold, then each
element of the set $A$ is an element of the set $B$, while conversely
each element of $B$ is an element of $A$; thus in this case $A$ and
$B$ consist of the same elements, that is, they coincide with one
another. This is denoted by the equation


$$A = B.$$


Sets can be formally denoted in the following distinct ways.
The simplest consists in enumerating all the elements of a
set, e.g. $A = \{1, 2, \ldots, n, \ldots\}$. An alternative form frequently
used appeals to the properties of the elements of a set; thus,
$A = \{x : x^2 - 1 < 0\}$ is the set of all $x$ for which the inequality
following the colon is satisfied.
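For finite sets the notation of this section can be tried out directly. The following is only an illustrative sketch: Python's built-in `set` type stands in for finite sets, the particular sets `A`, `B`, `C` are invented for the example, and the set-builder condition (taken here to be $x^2 - 1 < 0$) is evaluated on a finite grid of sample points, since a program cannot list an interval.

```python
# Finite sets written by enumerating their elements (hypothetical examples).
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}

# Membership: "a is an element of A" and "a is not an element of A".
print(2 in A)          # True
print(7 not in A)      # True

# Inclusion: every element of A is an element of B.
print(A <= B)          # True: A is a subset of B
print(A < B)           # True: A is a subset of B and differs from B

# The empty set and B itself are both subsets of B.
print(set() <= B, B <= B)

# Equality as two-sided inclusion.
C = {3, 2, 1}
print(A <= C and C <= A, A == C)

# Set-builder form {x : x^2 - 1 < 0}, restricted to a finite grid of
# sample points -2.0, -1.75, ..., 2.0 (an assumption of this sketch).
grid = [k / 4 for k in range(-8, 9)]
solutions = {x for x in grid if x * x - 1 < 0}
print(sorted(solutions))
```

Note that Python's `<` also counts the empty set as a proper subset, whereas the text reserves the term for subsets containing some, but not all, of the elements.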


2. OPERATIONS ON SETS 


We consider here three simple operations which can be carried 
out on sets: union, intersection, and forming the complement. 

We describe first the operation of forming the union of sets. 
Let the sets $A, B, C, \ldots$ be given. We consider the totality of all
elements, each of which belongs to at least one of $A, B, C, \ldots$ This
aggregate is a new set, which is termed the union of the sets
$A, B, C, \ldots$


Thus the union of the set

$$A = \{6, 7, 8, \ldots\}$$

(all the natural numbers greater than 5) and the set

$$B = \{3, 6, 9, \ldots\}$$

(all the natural numbers that are multiples of 3) is the set

$$S = \{3, 6, 7, 8, 9, 10, \ldots\}$$

(all the natural numbers with the exception of 1, 2, 4, and 5).


We introduce now the operation of forming the intersection of 
sets. The intersection of the sets A,B,C, ... is defined as the 
totality of elements belonging to each of the specified sets. 


Thus, in the preceding example, the intersection of the sets 


A = {6,7, 8,9, 10, ...} 

B = {3, 6, 9, 12, ...} 
is the set 

D = {6,9, 12, ...}. 


It may happen that the sets A, B, C, ... do not have a single 
element in common. In that case their intersection is the empty 
set, and the sets A, B,C, ... are said to be non-intersecting or 
disjoint. For example, the sets of integers 


$$A = \{1, 2\}, \quad B = \{2, 3\}, \quad C = \{1, 3\}$$


are disjoint (although when taken in pairs they have common 
elements). 
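The two examples above can be reproduced on a finite stand-in for the natural numbers. This is a sketch under stated assumptions: the helper names `union` and `intersection` and the cut-off at 30 are choices made here, not part of the text.

```python
from functools import reduce

def union(*sets):
    """Totality of elements belonging to at least one of the sets."""
    return reduce(lambda s, t: s | t, sets, set())

def intersection(first, *rest):
    """Totality of elements belonging to every one of the sets."""
    return reduce(lambda s, t: s & t, rest, set(first))

N = range(1, 31)                       # finite stand-in for 1, 2, 3, ...
A = {n for n in N if n > 5}            # 6, 7, 8, ...
B = {n for n in N if n % 3 == 0}       # 3, 6, 9, ...

S = union(A, B)
D = intersection(A, B)
print(sorted(set(N) - S))              # [1, 2, 4, 5]
print(sorted(D)[:3])                   # [6, 9, 12]

# Three sets with no element common to all of them, although every
# pair of them has a common element.
P, Q, R = {1, 2}, {2, 3}, {1, 3}
print(intersection(P, Q, R))           # set()
print(P & Q, Q & R, P & R)             # {2} {3} {1}
```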

Unions and intersections can be formed not only for finite 
collections of sets but also for infinite ones. For example, it is 
possible to form the union of the sets of points of all the straight 
lines in the plane that pass through a given point O. This union 
will obviously be the set of all points in the plane. The inter- 
section of the specified sets will consist in the single point O. 

The union $S$ of sets $A, B, C, \ldots$ is sometimes called their sum
and is written in the form $S = A + B + C + \ldots$; the intersection
$D$ is also termed the product and is denoted by $D = ABC \ldots$
There is some motivation for such “arithmetical” terminology.
For instance, for any three sets $A, B, C$ the equation


$$(A + B)C = AC + BC$$


holds. We shall give a proof of this equality as a simple but typical 
example of set-theoretic reasoning. 

As we remarked, two sets are considered equal if each element
of one of them is at the same time an element of the other. Thus
we have to show that each element $x$ of $(A + B)C$ (the left-hand
side) is an element of $AC + BC$ (the right-hand side) and conversely
that each element $y$ of $AC + BC$ is contained in $(A + B)C$.
First let $x$ belong to $(A + B)C$. Being an element of the intersection
of the sets $A + B$ and $C$, it must belong to each of them;
thus we have

$$x \in A + B \quad \text{and} \quad x \in C.$$

Since $x$ is contained in the union of $A$ and $B$, it is certainly
contained in one of them, say in $A$. But the inclusions $x \in A$,
$x \in C$ imply $x \in AC$, whence $x \in AC + BC$. And if $x$ is
contained, not in $A$, but in $B$, then in the same way $x \in BC$,
$x \in AC + BC$, as required. Conversely if $y$ belongs to the sum
$AC + BC$, then it belongs either to $AC$ or to $BC$, $y \in BC$ say.
But then $y \in B$ and $y \in C$; further it follows from $y \in B$
that $y \in A + B$ and finally that $y \in (A + B)C$. The case
$y \in AC$ is treated analogously and the proof is complete.

It is to be observed, however, that by no means all arithmetical 
rules carry over to operations with sets. For example, we have 
for sets A, B, C, the formulae 


$$A + A = A,$$
$$AA = A,$$
$$A + BC = (A + B)(A + C)$$
which are quite unlike the usual arithmetical equations. We suggest 
that the reader satisfy himself as to the accuracy of these for- 
mulae. 
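A reader who wishes to check these formulae mechanically before proving them can test them on every triple of subsets of a small set. The sketch below is not a proof; the universe $\{1, 2, 3\}$ and the function names are assumptions made for the illustration, with `|` playing the role of the sum and `&` the role of the product.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def check_identities(universe):
    """Return True if each formula holds for all triples of subsets."""
    family = subsets(universe)
    for A, B, C in product(family, repeat=3):
        if (A | B) & C != (A & C) | (B & C):      # (A + B)C = AC + BC
            return False
        if A | A != A or A & A != A:              # A + A = A, AA = A
            return False
        if A | (B & C) != (A | B) & (A | C):      # A + BC = (A + B)(A + C)
            return False
    return True

print(check_identities({1, 2, 3}))   # True
```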
We shall indicate a few further symbols for sums and intersections
of sets. For the union of sets we employ the symbols $\sum$
and $\bigcup$, so that, for example, the notation

$$S = \sum_{\nu=1}^{\infty} A_\nu \quad \text{or} \quad S = \bigcup_{\nu=1}^{\infty} A_\nu$$

denotes the union of the sets $A_1, A_2, \ldots, A_\nu, \ldots$
For the intersection of sets we employ the symbols $\prod$ and
$\bigcap$, so that, for example,

$$D = \prod_{\nu=1}^{\infty} A_\nu \quad \text{or} \quad D = \bigcap_{\nu=1}^{\infty} A_\nu$$

denotes the intersection of the sets $A_1, A_2, \ldots, A_\nu, \ldots$




We turn now to the operation of forming the complement. 

If the set $B$ is a subset of the set $A$, then the totality of those
elements of $A$ that do not belong to $B$ is termed the complement
of $B$ with respect to $A$ and is denoted by $CB$ or $A - B$.

We note the obvious formula

$$(A - B) + B = A.$$

Note that for two arbitrary sets $A$ and $B$, the result

$$(A + B) - B = A$$

is generally false; it is true only when $A$ and $B$ have no common
elements.


Of more complicated results we remark the following, which 
will be of frequent occurrence: 


$$C \sum_\nu B_\nu = \prod_\nu C B_\nu; \qquad (1)$$


it can be read as follows: the complement of a union of sets is the 
intersection of their complements. 
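We shall give a proof of this result. Let $x \in C \sum_\nu B_\nu$; then
$x \notin \sum_\nu B_\nu$; this implies that, for any $\nu$, $x \notin B_\nu$,
i.e. $x \in CB_\nu$; but then $x \in \prod_\nu CB_\nu$. Conversely, if
$x \in \prod_\nu CB_\nu$, then, for any $\nu$, $x \in CB_\nu$, i.e.
$x \notin B_\nu$ for any $\nu$; but then $x \notin \sum_\nu B_\nu$, i.e.
$x \in C \sum_\nu B_\nu$, as required.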
Operating again with $C$ on both sides of equation (1) and letting
$A_\nu = CB_\nu$, we obtain the result


$$\sum_\nu C A_\nu = C \prod_\nu A_\nu, \qquad (2)$$


i.e. the complement of an intersection of sets is the union of their
complements.

The results cited can be combined in the form of a general rule: 
the complement symbol C can be interchanged with either of the 
symbols $\sum$ and $\prod$ provided that each of the latter is substituted for
the other. 
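The two duality formulae can be checked on finite families as a sanity test. This is a sketch only: the enclosing set `A` (here the numbers 0 to 5) and the helper names are assumptions of the example, and complements are always taken with respect to `A`.

```python
from functools import reduce
from itertools import combinations

def complement(B, A):
    """Complement of B with respect to A (B is assumed to be a subset of A)."""
    return A - B

def check_duality(A, family):
    """Check formulas (1) and (2) for a finite family of subsets of A."""
    union = reduce(lambda s, t: s | t, family, frozenset())
    inter = reduce(lambda s, t: s & t, family, frozenset(A))
    comps = [complement(B, A) for B in family]
    union_of_comps = reduce(lambda s, t: s | t, comps, frozenset())
    inter_of_comps = reduce(lambda s, t: s & t, comps, frozenset(A))
    return (complement(union, A) == inter_of_comps         # formula (1)
            and union_of_comps == complement(inter, A))    # formula (2)

A = frozenset(range(6))
all_subsets = [frozenset(c) for r in range(7) for c in combinations(A, r)]

# Test every ordered pair of subsets of A.
ok = all(check_duality(A, [X, Y]) for X in all_subsets for Y in all_subsets)
print(ok)   # True
```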


3. EQUIVALENCE OF SETS 


We wish now to establish a rule by which it would be possible 
to compare different sets with respect to the number of elements 
in them. 




For finite sets no problem arises here: by counting the elements 
of two finite sets A and B, we can immediately determine which is 
relatively the richer in elements. It is natural to call finite sets A 
and B equivalent if they have the same number of elements. 
However, this definition of equivalence does not carry over directly 
to the case of infinite sets. We shall now give it a form in which 
the extension to infinite sets will be immediate. To this end 
we observe that to establish the equivalence or non-equiva- 
lence of finite sets A and B there is no actual necessity to 
enumerate their respective elements. For example, if the set A 
is the audience in a hall and if B is the set of chairs in the same 
hall, then instead of counting the students and chairs separately, 
it is possible, by assigning to each student a free chair, to deter- 
mine immediately and without any counting whether or not the 
specified sets are equivalent. 

The procedure elaborated in the example given is, in abstract 
terms, the establishing of a correspondence between the sets A and B.


We introduce the following important definition. If each element 
of the set A somehow determines a unique element of the set B 
while in addition each element of B is determined by one and only 
one element of A, then there is said to be a one-one correspondence 
between the sets A and B, and in this case the sets A and B are
called equivalent. 

This new definition of equivalence applies to any sets, which 
need not be finite; thus, for example, the infinite set A of natural 
numbers $1, 2, \ldots$ is equivalent to the set $B$ of negative integers
$-1, -2, \ldots$, since a one-one correspondence between $A$ and $B$
is established by the rule: to each number $n \in A$ corresponds the
number $-n \in B$.

In just the same way the set of natural numbers 1, 2, ... is 
equivalent to the set of all positive even integers 2,4, ...; the 
equivalence between them is effected by the rule $n \to 2n$. In this
example we see that a set can be equivalent to a proper subset of 
itself; it is obvious that such a situation can occur only with in- 
finite sets. 
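The two rules just described, $n \to -n$ and $n \to 2n$, can be examined on a finite sample of the natural numbers. The sketch below only tests finitely many cases; the function `is_one_one` and the sample size are assumptions of the illustration.

```python
def is_one_one(rule, inverse, domain_sample, in_target):
    """Check, on a finite sample, that `rule` maps into the target set and
    that `inverse` undoes it, so distinct elements get distinct images."""
    return all(in_target(rule(n)) and inverse(rule(n)) == n
               for n in domain_sample)

naturals = range(1, 1001)             # finite sample of 1, 2, 3, ...

# n -> -n : natural numbers to negative integers.
neg_ok = is_one_one(lambda n: -n, lambda m: -m, naturals, lambda m: m < 0)

# n -> 2n : natural numbers to positive even integers.
even_ok = is_one_one(lambda n: 2 * n, lambda m: m // 2, naturals,
                     lambda m: m > 0 and m % 2 == 0)

# Every even number up to 2000 is the image of some n in the sample,
# so the rule n -> 2n covers the proper subset of even numbers.
evens = {2 * n for n in naturals}
onto_ok = evens == set(range(2, 2001, 2))

print(neg_ok, even_ok, onto_ok)       # True True True
```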

The equivalence relation is denoted by the sign ~. This relation 
is easily seen to be transitive: if A ~ B and B~C, then A~C. 
If two sets are equivalent they are also said to be “equipotent”,
or to have the same “power”.




[Fig. 2 and Fig. 3]


The set of points of the closed interval [0, 1] is equivalent to 
the set of points of any other closed interval $[a, b]$; the correspondence
can be established, for example, with the aid of a central
projection as shown in Fig. 2. Similarly the sets of points of any
two distinct closed intervals are equivalent.†

The set of points of an open interval is equivalent to a set of 
points on a straight line (Fig. 3). 

It is not so easy to answer the question as to whether the set of 
points of a closed interval is equivalent to the set of points of an 
open interval. 

There is the following general theorem, which contains as a 
particular application the answer to this question. 


THEOREM (F. Bernstein, 1898). If the set A is equivalent to a
subset of the set B and the set B is equivalent to a subset of the set A, 
then the sets A and B are equivalent. 

† The closed interval $[\alpha, \beta]$ is determined by the inequality
$\alpha \le x \le \beta$ (end points are included) and the open interval
$(\alpha, \beta)$ by the inequality $\alpha < x < \beta$ (end points are not
included).

Proof. Let us denote by $B_1$ a subset of the set $B$ equivalent to
the set $A$, and by $A_1$ a subset of the set $A$ equivalent to $B$. In the
one-one correspondence $B \sim A_1$, elements of $B_1$ correspond to
certain elements in $A_1$, whose aggregate we shall denote by $A_2$, and
so we obtain the inclusion chain

$$A \supset A_1 \supset A_2,$$

while $A_2 \sim A$ since $A_2 \sim B_1$, $B_1 \sim A$. If we show that
$A \sim A_1$, the theorem will be proved (since $A_1 \sim B$). In the
one-one map of $A$ onto $A_2$, the set $A_1 \subset A$ is mapped onto a
certain set $A_3 \subset A_2$, the set $A_2 \subset A_1$ is mapped onto a
set $A_4 \subset A_3$, the set $A_3 \subset A_2$ is mapped onto a set
$A_5 \subset A_4$, and so on. Furthermore

the set $A - A_1$ maps onto $A_2 - A_3$;

the set $A_1 - A_2$ maps onto $A_3 - A_4$;

the set $A_2 - A_3$ maps onto $A_4 - A_5$;

and so on.
From this it follows that the sets

$$A - A_1, \quad A_2 - A_3, \quad A_4 - A_5, \quad A_6 - A_7, \quad \text{etc.}$$

are equivalent in pairs; the union of the sets, which are disjoint,

$$(A - A_1) + (A_2 - A_3) + (A_4 - A_5) + \ldots$$

is then equivalent to the union

$$(A_2 - A_3) + (A_4 - A_5) + (A_6 - A_7) + \ldots$$

We denote by $D$ the intersection of the sets $A, A_1, A_2, \ldots$ The
following equations then hold:

$$A = D + (A - A_1) + (A_1 - A_2) + (A_2 - A_3) + \ldots, \qquad (1)$$
$$A_1 = D + (A_1 - A_2) + (A_2 - A_3) + (A_3 - A_4) + \ldots \qquad (2)$$


We shall prove the first equation. Let $a \in A$; we shall show that
$a$ is contained in the right-hand side of equation (1). If $a$ is actually
in each of the sets $A_1, A_2, \ldots$ then $a \in D$, and the assertion is
proved. If, on the other hand, there are some $A_n$ to which $a$ does
not belong, let $A_k$ be the first such set, so that $a \in A_{k-1}$ (where
$A_0 = A$); but then $a \in A_{k-1} - A_k$ and is therefore contained in the
right-hand side of (1). Conversely if $a$ belongs to the right-hand side
of (1), then evidently $a \in A$, since each term in the right-hand side
is a subset of $A$.

Equation (2) is proved in exactly the same way. 

Equations (1) and (2) can be written in the form 


$$A = [D + (A_1 - A_2) + (A_3 - A_4) + \ldots] + [(A - A_1) + (A_2 - A_3) + \ldots], \qquad (3)$$
$$A_1 = [D + (A_1 - A_2) + (A_3 - A_4) + \ldots] + [(A_2 - A_3) + (A_4 - A_5) + \ldots]. \qquad (4)$$


The first square brackets on the right-hand sides of both equations
contain the same set, while the second square brackets
contain sets which were proved above to be equivalent. It is now
easy to establish the equivalence of the sets $A$ and $A_1$. To each
point of the set $D + (A_1 - A_2) + (A_3 - A_4) + \ldots \subset A$ let
correspond the same point in the set $A_1$; and to each point $a$ of the
set $(A - A_1) + (A_2 - A_3) + \ldots$ let correspond that point of the
set $(A_2 - A_3) + (A_4 - A_5) + \ldots$ which corresponds to $a$ in virtue
of the equivalence between these sets established above. Equations (3)
and (4) show that this correspondence exhausts all the elements of the
sets $A$ and $A_1$. Thus the required one-one correspondence between
$A$ and $A_1$ is established.
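The construction in the proof can be followed on a concrete case. The sketch below is an illustration under assumptions that are not in the text: $A$ is taken to be $\{0, 1, 2, \ldots\}$, $A_1 = \{1, 2, 3, \ldots\}$, $A_2 = \{2, 3, 4, \ldots\}$, and the one-one map of $A$ onto $A_2$ is the hypothetical rule `phi(n) = n + 2`. In this case $A_k = \{k, k+1, \ldots\}$ and $D$ is empty.

```python
def phi(n):
    """One-one map of A = {0, 1, 2, ...} onto A_2 = {2, 3, 4, ...}."""
    return n + 2

def phi_inverse(n):
    return n - 2

def in_A1(n):
    return n >= 1

def in_A2(n):
    return n >= 2

def moved_by_phi(a, max_steps=10_000):
    """True if a lies in (A - A_1) + (A_2 - A_3) + ..., the part that the
    proof maps by phi; False if a lies in D + (A_1 - A_2) + (A_3 - A_4) + ...

    Since phi maps A_k - A_(k+1) onto A_(k+2) - A_(k+3), pulling a back by
    phi_inverse lowers its level by 2 until it lands in A - A_1 or A_1 - A_2.
    """
    for _ in range(max_steps):
        if not in_A1(a):
            return True       # a came from a set A_(2j) - A_(2j+1)
        if not in_A2(a):
            return False      # a came from a set A_(2j+1) - A_(2j+2)
        a = phi_inverse(a)
    return False              # never left the chain: treat a as a point of D

def h(a):
    """The one-one correspondence between A and A_1 built in the proof."""
    return phi(a) if moved_by_phi(a) else a

print([h(a) for a in range(8)])   # [2, 1, 4, 3, 6, 5, 8, 7]
```

In this example the even numbers are shifted by `phi` and the odd numbers stay fixed, so the images fill out exactly $A_1$. The cut-off `max_steps` is a practical device of the sketch; the proof itself needs no such bound.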

Using the Bernstein theorem, it is easy to verify that the point-sets
of open and closed intervals are equipotent. In fact a given
closed interval $[\alpha, \beta]$ contains a set equivalent to the point-set
of a given open interval $(\gamma, \delta)$ (any open interval interior to
$[\alpha, \beta]$), while the open interval $(\gamma, \delta)$ contains a set
equivalent to the point-set of the closed interval $[\alpha, \beta]$ (any
interior closed interval). Applying the Bernstein theorem we have that
$[\alpha, \beta] \sim (\gamma, \delta)$, as required.


4. COUNTABLE SETS 


Definition. A set equivalent to the set of all natural numbers 
1, 2, ..., is said to be a countable set. 

Alternatively, a set is countable if all its elements can be num- 
bered, thereby exhausting the natural numbers. We shall cite a 
few theorems concerning countable sets. 

1. Every infinite subset B of a countable set A is likewise countable. 
In fact the elements of B can be renumbered according to their 
sequential order in A (in so doing, it will be necessary, since B is 
infinite, to use all the natural numbers). 

2. The union of a finite or countable number of countable sets is a 
countable set. Proof. Let us consider first the case of two sets. Let 
$A = \{a_1, a_2, \ldots\}$ and $B = \{b_1, b_2, \ldots\}$ be countable sets. We write
out all the elements of both sets in a single row as follows:


$$a_1, b_1, a_2, b_2, a_3, b_3, \ldots$$


Now all these elements can be renumbered according to their 
sequential order in the line. Of course an element occurring twice 
(i.e. one contained both in A and in B) acquires a number on its 
first occurrence and is omitted on its second. Consequently each 
element of the union of A and B is numbered as required. 




The set of all integers $0, \pm 1, \pm 2, \ldots$ is countable since it is the
union of two countable sets $1, 2, 3, \ldots$ and $0, -1, -2, \ldots$

The theorem is proved similarly for three, four, or in general, 
for any finite number of countable sets. For a countable number 
of countable sets, for instance for the aggregate of sets 


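$$A_1 = \{a_{11}, a_{12}, \ldots, a_{1n}, \ldots\},$$
$$A_2 = \{a_{21}, a_{22}, \ldots, a_{2n}, \ldots\},$$
$$\ldots$$


the only difference will be that the rule for writing all the elements
of the sets in a single line must be somewhat subtler, for example:


$$a_{11}; \; a_{21}, a_{22}, a_{12}; \; a_{31}, a_{32}, a_{33}, a_{23}, a_{13};$$
$$a_{41}, a_{42}, a_{43}, a_{44}, a_{34}, a_{24}, a_{14}; \; \ldots$$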


the remainder of the proof is unchanged. 
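One possible rule for writing the elements of countably many countable sets in a single line is sketched below. It walks the table of indices by expanding squares (row $n$ from left to right, then column $n$ upward), which is one reading of the rule indicated in the text; the generator names and the sample family in which $a_{ij} = i \cdot j$ are assumptions made for the illustration.

```python
from itertools import islice

def shell_order():
    """Yield index pairs (i, j) in the order
    (1,1); (2,1), (2,2), (1,2); (3,1), (3,2), (3,3), (2,3), (1,3); ..."""
    n = 1
    while True:
        for j in range(1, n + 1):         # along row n: a_n1, ..., a_nn
            yield (n, j)
        for i in range(n - 1, 0, -1):     # up column n: a_(n-1)n, ..., a_1n
            yield (i, n)
        n += 1

def enumerate_union(element):
    """Number the elements a_ij = element(i, j) of the union of the sets
    A_1, A_2, ..., omitting any element that has already been numbered."""
    seen = set()
    for i, j in shell_order():
        a = element(i, j)
        if a not in seen:
            seen.add(a)
            yield a

print(list(islice(shell_order(), 9)))

# Hypothetical family: A_i is the set of multiples of i, so a_ij = i * j.
print(list(islice(enumerate_union(lambda i, j: i * j), 10)))
```

Every pair $(i, j)$ is reached after finitely many steps, namely within the first $\max(i, j)^2$ terms, which is what guarantees that each element of the union receives a number.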


3. The set of all rational numbers (i.e. numbers of the form $p/q$,
where $p$ and $q$ are integers, $q \neq 0$) is countable.

In fact the set of all rational numbers is the union of the follow- 
ing countable sets: 


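(1) the set $A_1$ of all integers $n = 0, \pm 1, \pm 2, \ldots$;


(2) the set $A_2$ of all fractions of the form $n/2$,
$n = 0, \pm 1, \pm 2, \ldots$;


(3) the set $A_3$ of all fractions of the form $n/3$,
$n = 0, \pm 1, \pm 2, \ldots$;

. . . . . . . . . .

($k$) the set $A_k$ of all fractions of the form $n/k$,
$n = 0, \pm 1, \pm 2, \ldots$;

. . . . . . . . . .


The sets $A_1, A_2, \ldots, A_k, \ldots$ constitute a countable set of sets;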
since each of them is countable, their union is also countable in 
virtue of theorem 2, as asserted. 
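The decomposition of the rationals into the sets of fractions $n/k$ for fixed $k$ leads to an explicit numbering. The following sketch is one possible realisation, not the only one: it walks the table whose $k$-th row lists the fractions $n/k$ diagonal by diagonal and omits repeated values. The function names and the order $0, 1, -1, 2, -2, \ldots$ chosen for the integers are assumptions of the example.

```python
from fractions import Fraction
from itertools import count, islice

def nth_integer(p):
    """p-th term (p = 1, 2, ...) of the sequence 0, 1, -1, 2, -2, ..."""
    return p // 2 if p % 2 == 0 else -(p // 2)

def rationals():
    """Number the rationals by walking the table whose k-th row is
    {n/k : n an integer}, diagonal by diagonal, skipping repeats."""
    seen = set()
    for total in count(2):                 # k + position in row = total
        for k in range(1, total):
            n = nth_integer(total - k)     # numerator at that position
            q = Fraction(n, k)
            if q not in seen:              # e.g. 2/2 repeats 1/1
                seen.add(q)
                yield q

first = list(islice(rationals(), 10))
print([str(q) for q in first])
```

With these choices the numbering begins 0, 1, -1, 1/2, 2, -1/2, 1/3, -2, -1/3, 1/4, and any given rational appears after finitely many steps.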


4. If $A = \{a_1, \ldots, a_n, \ldots\}$ and $B = \{b_1, \ldots, b_n, \ldots\}$ are countable
sets, then the set of all pairs $(a_k, b_n)$ $(k, n = 1, 2, \ldots)$ is also countable.




In fact the set of all these pairs can be decomposed into a count- 
able set of countable sets 


$$A_1 = \{(a_1, b_1), (a_1, b_2), \ldots, (a_1, b_n), \ldots\},$$
$$A_2 = \{(a_2, b_1), (a_2, b_2), \ldots, (a_2, b_n), \ldots\},$$
$$\ldots$$


and by theorem 2 the union of these sets is a countable set. 

This example is capable of a geometric interpretation: to the
pair $(a_k, b_n)$ corresponds the point in the plane with coordinates
$a_k, b_n$; we see therefore that the set of all points in the plane both
of whose coordinates are rational is countable.
5. The set of all polynomials $P(x) = a_0 + a_1 x + \ldots + a_n x^n$
(of arbitrary degrees) with rational coefficients $a_0, a_1, \ldots, a_n$ is
countable. The set of all polynomials of the given form is the union of
a countable set of sets $A_n$ $(n = 0, 1, 2, \ldots)$ where $A_n$ denotes the
set of polynomials with degrees $\le n$. Hence, in view of theorem 2,
it is sufficient to show that each of the sets $A_n$ is countable. For
$n = 0$ it is a question of the countability of the rational numbers
themselves, which was established in theorem 3. We shall now
proceed by induction: we suppose that the set $A_n$ has been proved
countable, and go on to prove $A_{n+1}$ countable.

Each element of the set $A_{n+1}$ can be written in the form


$$Q(x) + a_{n+1} x^{n+1},$$


where $Q(x)$ is a polynomial of degree $\le n$ with rational coefficients,
i.e. an element of $A_n$, and $a_{n+1}$ is a rational number.

The set of polynomials $Q(x)$ is by hypothesis countable, and so
is the set of numbers $a_{n+1}$. Thus, corresponding to each element of
the set $A_{n+1}$ it is possible to assign a pair $(Q(x), a_{n+1})$, in which
each term runs through a countable set of values. By theorem 4 the
set of all such pairs is countable, and hence so is $A_{n+1}$.

Note. It is clear that in the given instance, it is immaterial that 
we actually consider polynomials, i.e. linear combinations of 
powers of x. For example, linear combinations of trigonometric or 
other functions could equally well be considered. In general, instead
of polynomials $a_0 + a_1 x + \ldots + a_n x^n$ with rational coefficients,
we can consider any array $(a_0, a_1, \ldots, a_n)$, each coordinate
of which is an element of some countable set; the proof given 
above shows essentially that the set of all such arrays is also a 
countable set. 




6. The set of all algebraic numbers (i.e. zeros of polynomials with 
rational coefficients) is countable. 

By theorem 5 we can enumerate all polynomials with rational 
coefficients, and so they will form a sequence 


$$P_1(x), P_2(x), \ldots, P_n(x), \ldots$$


But each of the specified polynomials (other than the one that is
identically zero) possesses a certain finite number of zeros. Writing
in a single line all the zeros of $P_1(x)$, then all the zeros of $P_2(x)$,
etc., and omitting repetitions, we are able to order the set of all
algebraic numbers, as required.


Problems. 1. Show that the sets given below are countable:
(a) The set of all closed intervals $a \le x \le b$, where the end-points $a$ and $b$
are rational numbers.
(b) The set of all finite combinations of rational points in the plane.


2. Show that the following sets are either finite or countable:
(a) A set of disjoint intervals on a line.
(b) A set of figure eights in the plane, no two of which intersect.
(c) The set of points of discontinuity of a monotone function.
(d) A set $M$ of real positive numbers, provided that all finite sums $\sum x_k$, $x_k \in M$,
are bounded by a fixed number $A$.


Hints. (a) A rational point can be chosen in each interval. (b) A point
with rational coordinates can be chosen in each half of a figure eight. (c) The
discontinuities $[f(c - 0), f(c + 0)]$ of a monotone function $f(x)$ are disjoint.
(d) Only finitely many points of $M$ can lie outside any interval $[0, \varepsilon]$.


Note. V. V. Grushin and V. P. Palamodov have established the same
result for a set of non-intersecting figures in the plane which possess triple
points (like the letter T), and also for a set of non-intersecting figures in
space which contain saddle-points or incorporate Möbius strips as parts of
themselves.

3. Resolve the set of natural numbers 1, 2, ... into a countable set of 
disjoint countable sets. 

4. (Riddle). I. X, a mathematician, recently received a visit from his dear 
brothers N. In the entrance-hall they took off their hats and hung them on
the stand. When they assembled to leave and began to put on their hats, it 
appeared, to their host’s great confusion, that they were a hat short. Nobody 
had come into the entrance-hall during this time. 


II. When the brothers N paid another visit to X (all with hats), they 
again hung their hats on the stand in the hall. When, on leaving, they
started to put on their hats, it turned out that there was a hat too many. 
Both host and guests recollected definitely that until their arrival the hat 
stand had been quite empty. 


III. On the next occasion, the guests put on their hats and left, and the
host, having seen his guests to the street, returned to discover that all the 
hats were hanging on the stand. 




IV. Finally, on the fourth occasion, the guests arrived hatless, and on 
their departure made use of the hats that remained from their last visit. 
When he had seen off his guests, the host again caught sight of all the hats 
on the hatstand, the same number as were there before the guests’ arrival.

What is the explanation of all these paradoxical events? 

See the hint on p. 19. 


5. SETS OF THE POWER OF THE CONTINUUM 


It is found that there exist infinite sets the elements of which 
are incapable of being put into a sequence. Such sets are called 
uncountable. A typical example of an uncountable set is the 
continuum, the set of all points of an arbitrary closed interval. 


THEOREM 1 (G. Cantor, 1874). The set of all points of the closed
interval $0 \le x \le 1$ is uncountable.


Proof. Suppose, on the contrary, that the set is countable, so that
its points can all be ordered in the sequence $x_1, x_2, x_3, \ldots$ Having got
this sequence, we construct a sequence of closed sub-intervals as
follows.

We partition the interval $[0, 1]$ into three equal parts. Wherever
the point $x_1$ lies, it cannot belong simultaneously to all three of
the intervals $[0, 1/3]$, $[1/3, 2/3]$, $[2/3, 1]$, and so one of them can be
selected which does not contain $x_1$ (neither as an interior point,
nor as an end-point); we denote this interval by $\Delta_1$. Further,
we denote by $\Delta_2$ one of the three equal parts of $\Delta_1$ that does not
contain $x_2$. When the intervals $\Delta_1 \supset \Delta_2 \supset \ldots \supset \Delta_n$
have been constructed in this way, we denote by $\Delta_{n+1}$ one of the
subintervals formed by trisection of $\Delta_n$ that does not contain
$x_{n+1}$, and so on. By a well-known theorem of analysis the infinite
sequence of nested closed intervals $\Delta_1 \supset \Delta_2 \supset \ldots$ has a
common point $\xi$. This point belongs to each of the $\Delta_n$ and so
cannot coincide with any one of the points $x_n$. But this shows that
the sequence $x_1, x_2, \ldots, x_n, \ldots$ cannot exhaust the points of the
interval $[0, 1]$, contrary to the original hypothesis. The theorem is
therefore proved.
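The trisection step of the proof can be carried out for any finite initial segment of a proposed sequence. The sketch below uses exact rational arithmetic; the function name `avoid` and the sample points are assumptions of the illustration, and it always takes the first suitable third, although the proof permits any.

```python
from fractions import Fraction

def avoid(points):
    """Given finitely many points x_1, ..., x_n of [0, 1], return nested
    closed intervals d_1, d_2, ..., d_n such that d_k does not contain x_k."""
    left, right = Fraction(0), Fraction(1)
    intervals = []
    for x in points:
        third = (right - left) / 3
        parts = [(left + i * third, left + (i + 1) * third) for i in range(3)]
        # A point lies in at most two of the three closed thirds, so at
        # least one third misses it, end-points included.
        left, right = next((a, b) for a, b in parts if not a <= x <= b)
        intervals.append((left, right))
    return intervals

xs = [Fraction(1, 3), Fraction(0), Fraction(1, 9), Fraction(1, 2)]
nested = avoid(xs)
a, b = nested[-1]
print(a, b, all(not a <= x <= b for x in xs))
```

Because the intervals are nested, the last interval avoids every one of the listed points, which is the finite counterpart of the statement that the common point differs from each term of the sequence.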

We saw that all the rational numbers of the interval $[0, 1]$ form
a countable set. The remaining numbers of the interval are called
irrational; for example, $\sqrt{1/2}$, $\pi/4$ etc. are irrational. We see now
that the irrationals greatly outnumber the rationals; to be precise,
the irrationals constitute a set that is certainly uncountable (otherwise,
if the set of irrationals were countable, the set of all numbers
in $0 \le x \le 1$, as the union of two countable sets, would also be
countable). Furthermore, since the algebraic numbers (the zeros
of polynomials with rational coefficients) also form a countable set
(Section 4), the transcendentals, those numbers that are not zeros
of polynomials with rational coefficients, constitute an uncountable
set.

The argument put forward here proves, incidentally, the actual 
existence of transcendental numbers, which is by no means ob- 
vious a priori. Every set equivalent to the set of all points of the 
closed interval [0, 1] is said to have the power of the continuum.

We saw that the point-sets of an arbitrary closed interval $[a, b]$,
an arbitrary open interval $(\alpha, \beta)$, and finally of the whole line
$-\infty < x < \infty$, are equivalent to that of the closed interval $[0, 1]$
and consequently have the power of the continuum. The following 
theorems facilitate the identification of new wide classes of sets 
with the power of the continuum. 

The first of these theorems relates to an arbitrary infinite set: 


THEOREM 2. If a finite or countable set B is added to an infinite 
set A, there results a set equivalent to the original set A. 

For the proof, we shall extract from $A$ an arbitrary countable
subset $C$, and let $D = A - C$. We have

$$A = D + C,$$
$$A + B = D + C + B.$$

Since the set $C$ is countable and $B$ is finite or countable, their
union $C + B$ is also a countable set; there exists therefore a one-one
correspondence between $C$ and $C + B$. Using this correspondence, and
extending it by making the points of $D$ self-corresponding points in
it, we obtain the required one-one correspondence between the sets $A$
and $A + B$.†
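As an illustration of the proof, take $A$ to be the open interval $(0, 1)$ and $B = \{0, 1\}$, so that $A + B$ is the closed interval $[0, 1]$. The sketch below makes the hypothetical choice $C = \{1/2, 1/3, 1/4, \ldots\}$ and matches the sequence $0, 1, 1/2, 1/3, \ldots$ term by term with $1/2, 1/3, 1/4, \ldots$; it handles rational inputs only, since every irrational point belongs to $D$ and stays fixed.

```python
from fractions import Fraction

def to_open_interval(x):
    """One-one map of the closed interval [0, 1] onto the open interval (0, 1).

    Here A = (0, 1), B = {0, 1}, C = {1/2, 1/3, 1/4, ...} and D = A - C.
    The countable set C + B, listed as 0, 1, 1/2, 1/3, ..., is matched
    term by term with C, listed as 1/2, 1/3, 1/4, ...; points of D stay put.
    """
    x = Fraction(x)
    if x == 0:
        return Fraction(1, 2)
    if x.numerator == 1:              # x = 1/n with n >= 1 (includes x = 1)
        return Fraction(1, x.denominator + 2)
    return x                          # x belongs to D

samples = [Fraction(0), Fraction(1), Fraction(1, 2), Fraction(1, 3),
           Fraction(2, 3), Fraction(3, 4)]
print([str(to_open_interval(x)) for x in samples])
# ['1/2', '1/3', '1/4', '1/5', '2/3', '3/4']
```

The same shifting device shows why removing or adding finitely or countably many points does not change the power of an infinite set.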

COROLLARY 1. If a finite or countable set $B$ is extracted from an
infinite set $Q$, the remainder $A = Q - B$ is equivalent to $Q$, provided
it again constitutes an infinite set.

This follows immediately from the equation Q = A + B on 
application of the theorem just proved to the set A. 

COROLLARY 2. The set of the irrationals has the power of the
continuum; likewise the set of the transcendentals.

Before turning to the theorems that follow, we shall consider 
the so-called dyadic representation of real numbers. 


† This theorem, incidentally, implies as a corollary the equivalence of
open and closed intervals, independently of Bernstein’s theorem.




We restrict ourselves to the real numbers belonging to the
closed interval [0, 1]. The point 1/2 divides this interval into two
equal parts, which we denote by $\Delta_0 = [0, 1/2]$ and $\Delta_1 = [1/2, 1]$.
The interval $\Delta_0$ is divided into two equal parts by the point 1/4;
we denote them by $\Delta_{00} = [0, 1/4]$ and $\Delta_{01} = [1/4, 1/2]$. Similarly
the point 3/4 bisects $\Delta_1$; we put $\Delta_{10} = [1/2, 3/4]$, $\Delta_{11} = [3/4, 1]$.
Continuing the process of bisection we obtain eight intervals
$\Delta_{000}, \Delta_{001}, \ldots, \Delta_{111}$ of length 1/8, sixteen intervals
$\Delta_{0000}, \Delta_{0001}, \ldots, \Delta_{1111}$ of length 1/16, and so on. The end points of
all these intervals are of the form $p/2^q$, where p and q are natural
numbers; these points, which clearly form a countable set, are called
dyadic rationals. The remaining points of the interval [0, 1] are called
dyadic irrationals; the set which they form has the power of the
continuum. We denote the aggregate of all the closed intervals
of the system constructed by $\Delta$.

For each point $\xi \in [0, 1]$, it is possible to find a sequence of
closed intervals in $\Delta$, successively embedded one in the other,
with lengths respectively equal to $1/2, 1/4, \ldots, 1/2^n, \ldots$, and
containing the point $\xi$. In fact $\xi$ belongs to one of the intervals
$\Delta_0$, $\Delta_1$; if it belongs to $\Delta_0$, say, then it belongs to $\Delta_{00}$ or
$\Delta_{01}$, and so on. Thus for every $\xi$, we get:

$$\Delta_{\varepsilon_1} \supset \Delta_{\varepsilon_1 \varepsilon_2} \supset \Delta_{\varepsilon_1 \varepsilon_2 \varepsilon_3} \supset \cdots \supset \Delta_{\varepsilon_1 \varepsilon_2 \ldots \varepsilon_n} \supset \cdots \ni \xi \tag{1}$$

(the numbers $\varepsilon_n$ are noughts or ones). Having the system of
inclusions (1), we can identify with $\xi$ the sequence of noughts and
ones:

$$\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n, \ldots \tag{2}$$

The symbol (2) determines the dyadic representation of the real
number $\xi$. Given the sequence (2), the number $\xi$ itself is
well-defined by the formula:

$$\xi = \sum_{n=1}^{\infty} \frac{\varepsilon_n}{2^n}; \tag{3}$$

the partial sums of the series (3) are actually the left end-points
of intervals appearing in the inclusions (1); it is clear that these
left end-points form a monotone (non-decreasing) sequence tending
to the value $\xi$.
It is indeed possible to commence with an arbitrary sequence (2)
of zeros and ones and construct a number $\xi$ from formula (3). It
is easy to see that this number $\xi$ will be the intersection of the
closed intervals of the corresponding system (1). Thus all possible
sequences of zeros and ones are involved in our construction.

If the point $\xi$ is not a dyadic rational, it determines uniquely all
the closed intervals occurring in (1). If, on the other hand, it is a
dyadic rational, then it is a common end-point of two adjacent
and equal closed intervals of the system $\Delta$ and at some stage of
our process the choice between them will be arbitrary. If we take
the one to the right, $\xi$ is its left end-point, so we shall have to take
all the subsequent intervals to the left and all the subsequent numbers
in (2) will be zeros. While if we take the one to the left, $\xi$ is its right
end-point, so we shall have to take all the subsequent intervals to the
right and all the corresponding numbers in (2) will be ones. It is easy
to see that, conversely, if the terms in (2) from a certain point on are
all zeros, or all ones, then the number $\xi$ is a dyadic rational. This
follows immediately from formula (3).
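
The bisection process lends itself to a short computation. The sketch below is an illustration only: the function names `dyadic_digits` and `partial_sum` and the flag `prefer_right` are introduced here, and exact rational arithmetic is assumed so that the case of a point falling exactly on a point of bisection can be detected.

```python
from fractions import Fraction


def dyadic_digits(xi, n, prefer_right=True):
    """First n digits of a dyadic representation of xi in [0, 1].

    The current closed interval [left, left + length] is bisected at each
    step. When xi is exactly the point of bisection, prefer_right decides
    which of the two adjacent intervals is taken.
    """
    xi = Fraction(xi)
    left, length = Fraction(0), Fraction(1)
    digits = []
    for _ in range(n):
        length /= 2
        mid = left + length
        go_right = xi > mid or (xi == mid and prefer_right)
        digits.append(1 if go_right else 0)
        if go_right:
            left = mid
    return digits


def partial_sum(digits):
    """Partial sum of series (3): the left end-point of the last interval."""
    return sum(Fraction(d, 2 ** (k + 1)) for k, d in enumerate(digits))
```

For the dyadic rational 1/2 the two possible choices give the digits 1, 0, 0, 0, ... and 0, 1, 1, 1, ... respectively, while a dyadic irrational such as 1/3 has the single representation 0, 1, 0, 1, ...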

The set of dyadic irrationals is therefore in one-one correspon- 
dence with the set of all sequences of noughts and ones that 
contain an infinite number of each; the set of dyadic rationals is 
in one-one correspondence with the set of all sequences where the 
elements are all zeros from some point on, and also with the set 
where they are all ones from some point on. We can now turn to 
the next theorem. 


THEOREM 3. The set of all sequences consisting of zeros and ones 
has the power of the continuum. 


Proof. The set under consideration is the union of three sets: 
the set of sequences containing an infinite number both of ones 
and of zeros, the set of sequences containing only a finite number 
of zeros, and the set of sequences containing only a finite number 
of ones. As we have shown, the first of these is equivalent to the 
set of all dyadic irrationals and therefore has the same power as 
the continuum; the other two sets are countable, since they are 
equivalent to the set of dyadic rationals. In virtue of theorem 2,
the set of all sequences of zeros and ones itself has the power of 
the continuum, as required. 


THEOREM 4. The set of all increasing sequences of natural numbers

$$0 < k_1 < k_2 < \cdots < k_n < \cdots \tag{4}$$

has the power of the continuum.




Proof. Each sequence (4) determines a sequence of zeros and
ones in which ones occupy the positions numbered $k_1, k_2, \ldots, k_n, \ldots$
and zeros the remaining positions. This clearly constitutes a one-one
correspondence between the set of all increasing sequences of
natural numbers and the set of all sequences of zeros and ones
that contain an infinite number of ones. The second set is obtained
from the set of all sequences of zeros and ones by extracting a
countable set, and so, by theorem 3 and corollary 1 of theorem 2,
has the power of the continuum; the theorem follows.
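
On finite initial segments the correspondence used in this proof can be written out directly. The following sketch is illustrative only; the function names are introduced here, and positions are numbered from 1 as in the text.

```python
def positions_to_bits(ks, length):
    """Initial segment of the 0/1 sequence with ones exactly at positions ks."""
    marked = set(ks)
    return [1 if i in marked else 0 for i in range(1, length + 1)]


def bits_to_positions(bits):
    """Increasing list of the (1-based) positions occupied by ones."""
    return [i for i, b in enumerate(bits, start=1) if b == 1]


segment = positions_to_bits([2, 3, 7], 8)
```

Each of the two functions undoes the other, which is the one-one character of the correspondence restricted to initial segments.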


THEOREM 5. The set of all sequences of natural numbers,

$$m_1, m_2, \ldots, m_n, \ldots \tag{5}$$

(not necessarily increasing) has the power of the continuum.

Proof. Each sequence of natural numbers (5) determines an
increasing sequence,

$$k_1 = m_1, \quad k_2 = m_1 + m_2, \quad \ldots, \quad k_n = m_1 + m_2 + \cdots + m_n, \quad \ldots$$

It is obvious that this constitutes a one-one correspondence
between the set of all sequences of natural numbers and the set
of increasing sequences. As we have shown, the second set has the
power of the continuum; it follows that the first set has the same
power.
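
The passage from a sequence (5) to its partial sums, and back again by taking successive differences, can be checked on finite segments. This is a minimal sketch with function names chosen here for illustration.

```python
from itertools import accumulate


def to_increasing(ms):
    """k_n = m_1 + ... + m_n for a finite segment of natural numbers m_i >= 1."""
    return list(accumulate(ms))


def from_increasing(ks):
    """Recover m_n = k_n - k_(n-1), with k_0 taken to be 0."""
    return [k - prev for prev, k in zip([0] + ks[:-1], ks)]


ks = to_increasing([3, 1, 1, 4])
```

Since every term of (5) is at least 1, the partial sums are strictly increasing, and the differences return the original terms.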


THEOREM 6. The set Z of all sequences of real numbers

$$\xi = (\xi_1, \xi_2, \ldots, \xi_n, \ldots)$$

has the power of the continuum.


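Proof. By theorem 5 each value $\xi_n$ determines a sequence of
natural numbers

$$\xi_n \to (p_{n1}, p_{n2}, \ldots, p_{nk}, \ldots)$$

and hence the symbol $\xi$ determines an array

$$\begin{matrix}
p_{11} & p_{12} & \cdots & p_{1k} & \cdots \\
p_{21} & p_{22} & \cdots & p_{2k} & \cdots \\
\cdots & \cdots & \cdots & \cdots & \cdots
\end{matrix}$$

But all the elements of this array can be written in a simple
sequence (cf. Section 4)

$$p_{11};\ p_{21}, p_{22}, p_{12};\ p_{31}, p_{32}, p_{33}, p_{23}, p_{13};\ p_{41}, p_{42}, p_{43}, p_{44}, p_{34}, p_{24}, p_{14};\ \ldots$$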




Thus $\xi$ determines a sequence of natural numbers. Conversely, it
is clear that each sequence of natural numbers can be obtained in
this way from some $\xi$. We see that the totality of symbols $\xi$ is
equivalent to the aggregate of all sequences of natural numbers,
and so, by theorem 5, has the power of the continuum.
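
One way of writing a doubly infinite array as a simple sequence is to run through it by successive square "shells": along row s up to the diagonal, then up column s. The sketch below assumes that this is the ordering intended; the function name `shell_order` is introduced here for illustration.

```python
def shell_order(n_shells):
    """Index pairs (row, column), numbered from 1, in the order
    (1,1); (2,1), (2,2), (1,2); (3,1), (3,2), (3,3), (2,3), (1,3); ...
    """
    order = []
    for s in range(1, n_shells + 1):
        order.extend((s, c) for c in range(1, s + 1))       # along row s
        order.extend((r, s) for r in range(s - 1, 0, -1))   # up column s
    return order


first_three = shell_order(3)
```

Every index pair occurs exactly once in the enumeration, so the elements of the array and the terms of the simple sequence are in one-one correspondence.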

The proof of an analogous theorem goes through, with correspond-
ing simplifications, in the case where $\xi$ is defined by only a finite,
and not a countable, number of coordinates:

$$\xi = (\xi_1, \ldots, \xi_n).$$

If we assume that each of the coordinates $\xi_1, \ldots, \xi_n$ runs
through the real line, we obtain the result that the set of all points 
in n-dimensional space has the power of the continuum for any n. 
In particular, the set of all complex numbers (or, what amounts 
to the same thing, the set of points in the plane) has the power of 
the continuum. 

Note. There is no need whatever to regard each coordinate $\xi_n$
as just a real number: the theorem holds if $\xi_n$ runs through any
set with the power of the continuum. 


THEOREM 7. The set C(a, b) of all continuous functions f(x) de-
fined on the closed interval [a, b] has the power of the continuum.


Proof. Let $r_1, r_2, \ldots, r_n, \ldots$ be a sequence of all the rational
points of the closed interval [a, b]. We identify with each continuous
function f(x) a sequence of real numbers, namely the values of the
function f(x) at the points $r_1, r_2, \ldots, r_n, \ldots$:

$$f(x) \sim \{f(r_1), f(r_2), \ldots, f(r_n), \ldots\} = \{f(r_n)\}.$$


Under this correspondence two distinct functions f(x) and g(x) 
will determine distinct sequences $\{f(r_n)\}$ and $\{g(r_n)\}$, since if two
continuous functions coincide at all rational points they coincide 
everywhere. Thus the set C (a, b) of all continuous functions can 
be considered equivalent to some subset of the set of all numerical 
sequences. On the other hand, the set of all numerical sequences 
has by theorem 6 the power of the continuum, and is therefore 
equivalent to the subset of C(a, b) comprising just constants. By 
Bernstein’s theorem (Section 3), C(a, b) is equivalent to the set
of all numerical sequences, and consequently shares with it the 
power of the continuum. 




Problems. 1. Show that the set of all continuous functions f(x, y) defined
on the square $|x| \le 1$, $|y| \le 1$, has the power of the continuum.

2. The function f(x) ($a \le x \le b$) is called a Baire function of the first class
if it is the limit of a sequence of continuous functions:

$$f(x) = \lim_{n \to \infty} f_n(x), \quad f_n(x) \in C(a, b).$$

Show that the power of the set of all Baire functions of the first class is
equal to that of the continuum.
Hint to problem 4, Section 4. The set of brothers, N, is a countable set. 


6. SETS OF HIGHER POWERS


If the sets A and B are not equivalent, but one of them, say A,
is equivalent to some subset of B, the set B is said to have a higher 
power than A. Thus a countable set has a higher power than any 
finite set, and the continuum a higher power than a countable set. 

There exist sets of higher power than the continuum. Further- 
more, given a set of a certain power, it is always possible to con- 
struct a set of higher power using the following theorem: 


THEOREM (G. Cantor, 1878). Let a set A be given and let B be the 
aggregate of all subsets of A. Then the set B is of higher power than 
the set A. 


Proof. We assume the existence of a one-one correspondence
between the sets A and B, in other words, we assume that to each
element $x \in A$ corresponds a uniquely determined subset $A_x$ of A.
The element x may or may not belong to the subset $A_x$; the first
possibility is realised, for example, by $A_x$ coinciding with the whole
set A, and the second when $A_x$ is the empty subset. We shall call
elements of the first kind “good” and elements of the second kind
“bad”. We collect together all the bad elements x, i.e. those that
do not belong to the corresponding subsets $A_x$, and let Z denote
their aggregate. In virtue of the one-one correspondence between
the subsets of the set A and its elements some element $\xi$ must
correspond to the subset Z. We examine two possibilities: $\xi$ is
either a good element or a bad one. If $\xi$ is good, it belongs to the
corresponding subset, i.e. $\xi \in Z$. But by construction Z consists
only of bad elements, so the first possibility is excluded.

If $\xi$ is a bad element, it does not belong to the corresponding
subset: $\xi \notin Z$. But by construction Z contains all the bad elements;
hence the second possibility is excluded.


We have obtained a contradiction: $\xi$ can be neither good nor bad.
Hence the original premise, the equivalence of the set A and the 
set B of all subsets of A, must be false; these sets cannot be equi- 
valent. Since A itself is obviously equivalent to a subset of B (the 
one with singleton subsets of A as elements), we deduce that B has 
a higher power than A. The theorem is proved. 
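
For a small finite set the argument can be verified exhaustively. The sketch below is an illustration only and proves nothing about infinite sets: it runs through every assignment of a subset $A_x$ to each element x of a three-element set and checks that the set Z of bad elements is never one of the assigned subsets. All function names are introduced here.

```python
from itertools import combinations, product


def powerset(elements):
    """All subsets of a finite collection, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def bad_elements(assignment):
    """Z = set of those x that do not belong to their own subset A_x."""
    return frozenset(x for x, subset in assignment.items() if x not in subset)


def z_is_never_assigned(elements):
    """True if, for every assignment x -> A_x, Z differs from every A_x."""
    items = list(elements)
    subsets = powerset(items)
    for choice in product(subsets, repeat=len(items)):
        assignment = dict(zip(items, choice))
        if bad_elements(assignment) in assignment.values():
            return False
    return True


result = z_is_never_assigned({0, 1, 2})
```

The same helper `powerset` also gives the count of subsets of a finite set, for instance 16 subsets for a set of 4 elements.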


Examples. 1. It is easy to calculate that a finite set of n elements
has just $2^n$ distinct subsets.

2. The set of all subsets of a countable set obviously coincides 
with the set of all sequences of distinct rational numbers, and 
therefore has the power of the continuum. 

3. The set of all subsets of the continuum can be represented as 
a certain set of functions defined on the closed interval [0, 1]. 
Namely, to the subset A of this closed interval corresponds the 
function $f_A(x)$, equal to 1 when $x \in A$ and 0 when $x \notin A$ (the
characteristic function of the set A). The set of all such functions
therefore has a power greater than that of the continuum. And
certainly the set of all functions on the closed interval [a, b] has a
power higher than that of the continuum. We recall that the set 
of continuous functions on a closed interval has the power of the 
continuum (Section 5, theorem 7). 


Concluding remark 


The fundamental ideas of set theory were first formulated at 
the end of the 19th century in the works of Georg Cantor (German
mathematician, 1845-1918), and since that time have pervaded
various regions of mathematics, consummating its language quite
remarkably. For a more detailed account we recommend
Mengenlehre by F. Hausdorff (De Gruyter, Berlin, 1935), English
translation Set theory (Chelsea, New York, 1957), and A. A. Fraenkel’s
Foundations of set theory (Amsterdam, 1958).


CHAPTER II 


METRIC SPACES 


1. DEFINITION AND EXAMPLES OF METRIC SPACES. ISOMETRY


One of the most important concepts in mathematical analysis 
is that of the limiting process; it underlies such fundamental ana- 
lytic operations as differentiation and integration. 

A number x is said to be the limit of a sequence of real numbers $x_n$
if the distance between $x_n$ and x, i.e. the modulus of the difference
$x - x_n$, tends to zero as $n \to \infty$. Thus the concept of the passage
to the limit is based on the possibility of measuring the distance 
between points on the real line. Similarly the concept of the limiting 
process in the plane or in multidimensional space is based on the 
possibility of measuring the distance between points of the corre- 
sponding sets. We introduce the further concept of a metric space; 
this term will apply to an aggregate of objects to which are assigned 
mutual “distances” that fulfil certain intrinsic conditions. These
“distances” facilitate the investigation of the properties of the
limiting process “in its pure form”, i.e. independently of the
characteristics of the particular elements.


1. Definition. An arbitrary set M of elements (“points”) x, y, ...
is termed a metric space if: (1) there is a rule that for any two points
x, y determines a number $\rho(x, y)$ (“the distance from x to y”),
(2) this rule fulfils the following requirements (axioms):


(1) $\rho(y, x) = \rho(x, y)$ for any x and y (the symmetry of distance);

(2) $\rho(x, y) > 0$ when $x \ne y$; $\rho(x, x) = 0$ for any x;

(3) $\rho(x, z) \le \rho(x, y) + \rho(y, z)$ for any x, y, z (the triangle
inequality).
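
The three axioms can be tested mechanically on any finite collection of points. The following sketch is illustrative only: the function name `is_metric_on`, the tolerance and the sample points are assumptions made here, and a successful check on finitely many points is of course no proof that a rule is a metric on the whole set.

```python
import math
from itertools import product


def is_metric_on(points, rho, tol=1e-12):
    """Check axioms 1-3 for the rule rho on a finite list of points."""
    for x, y in product(points, repeat=2):
        if abs(rho(x, y) - rho(y, x)) > tol:          # axiom 1: symmetry
            return False
        if x == y and abs(rho(x, y)) > tol:           # axiom 2: rho(x, x) = 0
            return False
        if x != y and not rho(x, y) > 0:              # axiom 2: positivity
            return False
    for x, y, z in product(points, repeat=3):
        if rho(x, z) > rho(x, y) + rho(y, z) + tol:   # axiom 3: triangle
            return False
    return True


def euclid(p, q):
    return math.dist(p, q)


def squared(p, q):
    return math.dist(p, q) ** 2


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]
```

The usual geometric distance passes the check, whereas its square fails the triangle inequality already for the three collinear points (0, 0), (1, 0), (2, 0).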

Examples. 1. Any set M on the real line $R_1$ constitutes a metric
space with distance-function $\rho(x, y) = |x - y|$. Similarly a set M
in the plane $R_2$ or in three-dimensional space $R_3$ is a metric space
if we consider as the distance between points the usual geometric
distance; for points $x = (\xi_1, \xi_2, \xi_3)$, $y = (\eta_1, \eta_2, \eta_3)$ in $R_3$

$$\rho(x, y) = \sqrt{(\xi_1 - \eta_1)^2 + (\xi_2 - \eta_2)^2 + (\xi_3 - \eta_3)^2}.$$


The triangle inequality (axiom 3) is in this instance the usual 
geometric inequality: the third side of a triangle does not exceed 
the sum of the other two sides. 

Similarly, in n-dimensional space $R_n$, the distance between points
$x = (\xi_1, \ldots, \xi_n)$, $y = (\eta_1, \ldots, \eta_n)$ can be defined by the formula

$$\rho(x, y) = \sqrt{\sum_{j=1}^{n} (\xi_j - \eta_j)^2}, \tag{1}$$

so that any set M in n-dimensional space constitutes a metric space
with distance-function (1).

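Axioms 1 and 2 are evidently satisfied here. To verify that
axiom 3 is satisfied we apply Cauchy's inequality†

$$\left( \sum_{j=1}^{n} a_j b_j \right)^2 \le \sum_{j=1}^{n} a_j^2 \sum_{j=1}^{n} b_j^2,$$

which holds for any real $a_1, \ldots, a_n$, $b_1, \ldots, b_n$. Substituting in this
inequality $a_j = \xi_j - \eta_j$, $b_j = \eta_j - \zeta_j$, where $z = (\zeta_1, \ldots, \zeta_n)$,
we find that

$$\rho^2(x, z) = \sum_{j=1}^{n} (\xi_j - \zeta_j)^2 = \sum_{j=1}^{n} [(\xi_j - \eta_j) + (\eta_j - \zeta_j)]^2$$

$$= \sum_{j=1}^{n} (\xi_j - \eta_j)^2 + 2 \sum_{j=1}^{n} (\xi_j - \eta_j)(\eta_j - \zeta_j) + \sum_{j=1}^{n} (\eta_j - \zeta_j)^2$$

$$\le \sum_{j=1}^{n} (\xi_j - \eta_j)^2 + 2 \sqrt{\sum_{j=1}^{n} (\xi_j - \eta_j)^2 \sum_{j=1}^{n} (\eta_j - \zeta_j)^2} + \sum_{j=1}^{n} (\eta_j - \zeta_j)^2$$

$$= \left( \sqrt{\sum_{j=1}^{n} (\xi_j - \eta_j)^2} + \sqrt{\sum_{j=1}^{n} (\eta_j - \zeta_j)^2} \right)^2$$

$$= [\rho(x, y) + \rho(y, z)]^2, \text{ as required.}$$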


† We give a proof of this inequality. We put $A = \sum a_j^2$, $B = \sum b_j^2$,
$C = \sum a_j b_j$; we have to prove

$$C^2 \le AB; \tag{*}$$

this inequality will be satisfied if the quadratic polynomial

$$P(\lambda) = A\lambda^2 + 2C\lambda + B$$

does not have distinct real zeros. But

$$P(\lambda) = \sum a_j^2 \lambda^2 + 2 \sum a_j b_j \lambda + \sum b_j^2 = \sum (a_j \lambda + b_j)^2,$$

so that $P(\lambda)$ cannot have more than one real zero, namely
$\lambda = -b_1/a_1 = \cdots = -b_n/a_n$. Thus the inequality (*) holds.
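
Cauchy's inequality is easily tested numerically. The sketch below, with names and random test data chosen here purely for illustration, computes the quantity $AB - C^2$ in the notation of the footnote for a thousand randomly chosen pairs of vectors.

```python
import random


def cauchy_gap(a, b):
    """AB - C^2, where A = sum a_j^2, B = sum b_j^2, C = sum a_j b_j."""
    A = sum(x * x for x in a)
    B = sum(y * y for y in b)
    C = sum(x * y for x, y in zip(a, b))
    return A * B - C * C


random.seed(0)                      # fixed seed so that the run is repeatable
gaps = []
for _ in range(1000):
    n = random.randint(1, 6)
    a = [random.uniform(-5, 5) for _ in range(n)]
    b = [random.uniform(-5, 5) for _ in range(n)]
    gaps.append(cauchy_gap(a, b))
```

Up to rounding error no gap is negative, and the gap vanishes for proportional vectors such as (1, 2, 3) and (2, 4, 6), the case in which $P(\lambda)$ has its single real zero.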




2. In problems of analysis, spaces occur in which the elements are 
functions (functional spaces). 

The choice of a particular metric in functional spaces depends on 
the requirements of the problem. When there is a geometric dis- 
tance, it is clear that those elements that are close together must be 
considered close in terms of the metric. In analysis it is necessary 
for the most part to proceed the other way about: it is clear from 
the conditions of the problem what elements it is natural to regard 
as close together, and a definition of distance is introduced accord- 
ingly. 

For instance, it is often natural to regard continuous functions
x(t) and y(t) ($a \le t \le b$) as close together if

$$\max_{a \le t \le b} |x(t) - y(t)|$$

is small. This quantity can be taken as a definition of the distance
between x(t) and y(t); such a definition obviously satisfies axioms
1-3, and consequently any set M of functions, defined and
continuous on the closed interval [a, b], constitutes a metric space with
metric

$$\rho(x, y) = \max_{a \le t \le b} |x(t) - y(t)|. \tag{2}$$

3. In some cases (for example, in the calculus of variations),
which involve functions that have derivatives up to the kth order,
it is natural to consider close together those elements x(t) and y(t),
for which the values, not only of the functions themselves, but
also of their derivatives up to order k, are close together. The
distance formula

$$\rho(x, y) = \max_{a \le t \le b} \{ |x(t) - y(t)|, |x'(t) - y'(t)|, \ldots, |x^{(k)}(t) - y^{(k)}(t)| \} \tag{3}$$

meets this requirement. A set of functions x(t) with continuous
derivatives up to order k evidently constitutes a metric space with
the metric (3).
4. In other cases (for example, in the theory of integral equations),
it is natural to regard functions x(t) and y(t) as close together if
they are close in the integral sense, i.e. if

$$\int_a^b |x(t) - y(t)| \, dt$$

is small. It is natural in this case to define distance by the formula

$$\rho(x, y) = \int_a^b |x(t) - y(t)| \, dt. \tag{4}$$

The axioms for a metric space are then clearly satisfied.

5. Sometimes it is necessary to define the proximity of functions,
not linearly in terms of their difference, but in terms of some power
of their difference; distance can then be given by the formula

$$\rho^p(x, y) = \int_a^b |x(t) - y(t)|^p \, dt. \tag{5}$$

For $p \ge 1$, this definition satisfies the axioms for a metric space,
though (with the exception of the simple cases p = 1 and p = 2)
the verification of axiom 3 becomes rather involved; we shall not
give it here.†
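
The metrics of examples 2, 4 and 5 can be approximated numerically for given functions. The sketch below is a rough illustration only: the maximum is taken over a finite grid and the integrals are computed by the trapezoidal rule, so the results are approximations; the function names, the grid size and the test functions are assumptions made here.

```python
def sample(a, b, n):
    """n + 1 equally spaced points of the closed interval [a, b]."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def dist_max(x, y, a, b, n=10_000):
    """Approximation to metric (2): max |x(t) - y(t)| over a grid."""
    return max(abs(x(t) - y(t)) for t in sample(a, b, n))


def dist_p(x, y, a, b, p=1.0, n=10_000):
    """Approximation to (integral of |x - y|^p dt)^(1/p), trapezoidal rule.

    p = 1 corresponds to metric (4); other p >= 1 to metric (5).
    """
    vals = [abs(x(t) - y(t)) ** p for t in sample(a, b, n)]
    h = (b - a) / n
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return integral ** (1.0 / p)


def line(t):
    return t


def parabola(t):
    return t * t


d_max = dist_max(line, parabola, 0.0, 1.0)
d_one = dist_p(line, parabola, 0.0, 1.0, p=1.0)
d_two = dist_p(line, parabola, 0.0, 1.0, p=2.0)
```

For x(t) = t and y(t) = t² on [0, 1] the exact values are 1/4, 1/6 and $\sqrt{1/30}$ respectively, so the same pair of functions lies at different distances in the different metrics.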

Thus the definition of a metric space is sufficiently elastic to 
meet the very diverse concrete requirements of mathematical ana- 
lysis. In reviewing our whole course, we shall be convinced of the 
accuracy of this observation. 

The metric space of all functions continuous on the closed
interval $a \le t \le b$, with metric defined by (2), is denoted by C(a, b).
The metric space of all functions continuous on [a, b], and having
continuous derivatives up to order k, with metric defined by (3),
is denoted by $D_k(a, b)$; we put $D_0(a, b) = C(a, b)$. The metric space
of all functions continuous on [a, b] together with the metric (5) is
denoted by $C_p(a, b)$.

2. The inequality expressed in axiom 3 can be generalized for the
case of any m elements $x_1, x_2, \ldots, x_m$, in the form

$$\rho(x_1, x_m) \le \rho(x_1, x_2) + \rho(x_2, x_3) + \cdots + \rho(x_{m-1}, x_m).$$

This inequality follows from successive applications of axiom 3:

$$\rho(x_1, x_m) \le \rho(x_1, x_2) + \rho(x_2, x_m) \le \rho(x_1, x_2) + \rho(x_2, x_3) + \rho(x_3, x_m) \le \cdots$$


We observe the following simple property of metrics, which may
be termed the “quadrangle inequality”: for any four points x, y, z,
u of a metric space, we have

$$|\rho(x, z) - \rho(y, u)| \le \rho(x, y) + \rho(z, u). \tag{1}$$


† Cf. Chapter IV, Section 5, art. 3.




Geometrically, this means that the difference between one pair of
opposite sides of a quadrangle never exceeds the sum of the other
pair. The proof follows from the inequalities

$$\rho(x, z) \le \rho(x, y) + \rho(y, u) + \rho(u, z),$$
$$\rho(y, u) \le \rho(y, x) + \rho(x, z) + \rho(z, u),$$

if from the first we subtract $\rho(y, u)$ and from the second $\rho(x, z)$.
When z = u the quadrangle inequality reduces to the second
triangle inequality

$$|\rho(x, z) - \rho(y, z)| \le \rho(x, y), \tag{2}$$

which also has frequent application.
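
A numerical spot-check of the quadrangle inequality in the plane, with the usual geometric distance, may be sketched as follows. The function names, the random sample and the tolerance are assumptions made here for illustration.

```python
import math
import random


def quadrangle_holds(x, y, z, u, rho=math.dist, tol=1e-12):
    """|rho(x, z) - rho(y, u)| <= rho(x, y) + rho(z, u), up to rounding."""
    return abs(rho(x, z) - rho(y, u)) <= rho(x, y) + rho(z, u) + tol


def random_point():
    return (random.uniform(-10, 10), random.uniform(-10, 10))


random.seed(1)                      # fixed seed so that the run is repeatable
checks = [quadrangle_holds(*(random_point() for _ in range(4)))
          for _ in range(1000)]
```

Taking z and u to be the same point gives the corresponding check of the second triangle inequality.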


3. In the theory of sets the concept of equivalence played an 
essential part. Equivalent sets, i.e. sets in one-one correspondence 
with each other, were from the pure set-theoretic point of view 
absolutely identical, although they were composed of completely 
different elements. It was shown, for example, that the set of 
points of a closed interval and the set of functions defined and 
continuous on that interval have the same power, and so, in set- 
theoretic terms, there is no distinction whatever to be drawn be- 
tween these sets. 

But if we are considering two sets that constitute metric spaces 
(and interest us in this capacity), set-theoretic equivalence is not 
a sufficient criterion for us to consider them identical as spaces, 
since they may possess quite distinct metric relations. For instance, 
the metric spaces formed by the points of the closed interval [a, b],
and the functions continuous on this interval, though
set-theoretically equivalent, are dissimilar in respect of their metrics, because
in the first space distances between elements are bounded by the
constant b − a, while in the second they are not bounded at all;
many other differences could also be indicated. The following defi- 
nition then suggests itself: 

Two metric spaces are said to be isometric if it is possible to set 
up a one-one correspondence between their elements that preserves 
the distance between corresponding pairs of elements. 

In other words, if M and M′ are isometric spaces and to elements
$x, y \in M$ correspond elements $x', y' \in M'$, then $\rho(x, y) = \rho(x', y')$.

For example, the spaces C(0, 1) and C(0, 2) of functions
continuous on the closed intervals [0, 1] and [0, 2] respectively are
isometric. The correspondence between their elements can be
expressed in the formula

$$C(0, 1) \ni x(t) \leftrightarrow y(t) = x\left(\frac{t}{2}\right) \in C(0, 2).$$

It is easily seen that this correspondence is one-one and distance-
preserving as required.
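
That the correspondence preserves distance can be observed numerically. In the sketch below the maximum in metric (2) is approximated on a grid, and the two test functions are arbitrary choices made here; the names `sup_dist` and `stretch` are likewise introduced for illustration.

```python
import math


def sup_dist(f, g, a, b, n=2000):
    """max |f(t) - g(t)| over n + 1 equally spaced points of [a, b]."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    return max(abs(f(t) - g(t)) for t in ts)


def stretch(x):
    """Image y(t) = x(t / 2) in C(0, 2) of a function x in C(0, 1)."""
    return lambda t: x(t / 2)


def x1(t):
    return math.sin(3 * t)


def x2(t):
    return t * t


d_before = sup_dist(x1, x2, 0, 1)
d_after = sup_dist(stretch(x1), stretch(x2), 0, 2)
```

The two computed distances agree, since the grid on [0, 2] is carried onto the grid on [0, 1] by the substitution t → t/2.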


2. OPEN SETS 


1. The aggregate of all points x of a metric space M whose distance
from a given point $x_0$ is less than a fixed magnitude r > 0, so that

$$\rho(x, x_0) < r,$$

is termed a sphere (more precisely, an open sphere) of radius r; the
point $x_0$ is said to be the centre of this sphere. The aggregate of all
points x that satisfy the inequality

$$\rho(x, x_0) \le r,$$

is termed a closed sphere of radius r. Finally, the set of all points
situated precisely at a distance r from the point $x_0$, so that

$$\rho(x, x_0) = r,$$

constitutes a spherical surface of radius r with centre $x_0$. We now
give the following important definition.

A set U in a metric space M is said to be an open set or region, if
each point $x_0 \in U$ is an interior point of U, i.e. if it is the centre of
some open sphere contained in U (the radius of which will, in
general, depend on $x_0$).

Thus the open sphere

$$U = \{x : \rho(x, x_1) < r\}$$

with centre at some point $x_1$ is an open set. To see this, let $x_0 \in U$
so that $\rho(x_0, x_1) = \theta < r$. We consider a sphere $U_0$ with centre $x_0$
and radius $r_0 < r - \theta$, and show that $U_0$ is contained in U. In
fact, for any $x \in U_0$, we have by the triangle inequality

$$\rho(x, x_1) \le \rho(x, x_0) + \rho(x_0, x_1) < r_0 + \theta < r - \theta + \theta = r,$$

as required.


Operations with open sets. The union of any number of open
sets is evidently also an open set.

The intersection of any finite number of open sets is an open set.
For let the point $x_0$ belong to the open sets $U_1, U_2, \ldots, U_m$, and let
it be contained in the first of these together with a sphere of radius $r_1$
(with centre $x_0$), in the second together with a sphere of radius $r_2$,
etc.; then the sphere with centre $x_0$ and radius $\min(r_1, \ldots, r_m)$ is
contained in each of the sets $U_1, \ldots, U_m$, and is therefore contained
in their intersection as well.

For an infinite number of open sets, the foregoing reasoning
breaks down, since the minimum (more precisely, the exact lower
bound) of an infinite set of positive numbers may be zero. And in
fact the intersection of the infinite aggregate of open sets

$$U_n = \left\{ x : \rho(x, x_0) < \frac{1}{n} \right\} \quad (n = 1, 2, \ldots)$$

contains only those points x for which $\rho(x, x_0) = 0$, i.e., by axiom 2,
only the point $x_0$; this intersection is not, in general, an open set.


2. On the real axis $-\infty < x < \infty$, every open interval $(\alpha, \beta)$
(bounded or unbounded) is evidently an open set. The union of a
finite or countable number of open intervals $(\alpha_\nu, \beta_\nu)$ ($\nu = 1, 2, \ldots$)
without common points is also an open set. We shall show that
every open set U on the real axis is the union of a finite or countable
number of disjoint open intervals.

We consider an arbitrary point $x \in U$. By definition, U contains
together with x some sphere, i.e. some open interval on the real
axis that contains x. We now construct the largest open interval
that contains x and is contained entirely in U.

We denote by S the set of points that lie to the right of x and do
not belong to U. If S is empty the entire half-line $(x, \infty)$ is
contained in U. If S is non-empty it possesses an exact lower bound $\xi$.
This point $\xi$ cannot belong to U, since any point in U has a
neighbourhood which lies entirely in U and cannot therefore contain a
single point of S. In particular $\xi \ne x$. It is also clear that the whole
open interval $(x, \xi)$ is contained in U.

We proceed similarly with points lying to the left of x and obtain
an open interval $(\eta, x)$, contained in U, the left-hand end of which
does not belong to U (the interval may be the whole of the half-line
$(-\infty, x)$).

Thus, given the point $x \in U$, we have formed the interval $(\eta, \xi)$
belonging to U and such that its end-points (of which either or
both may lie at infinity) do not belong to U. Such an interval is
said to be a component open interval of the open set U.

If two component intervals $(\eta_1, \xi_1)$, $(\eta_2, \xi_2)$ have a common
point x, they coincide; for the inequality $\xi_1 < \xi_2$, say, is impossible,
since $\xi_1$ must, on the one hand, as an interior point of the interval
$(\eta_2, \xi_2)$, belong to U, and on the other, as an end-point of the
interval $(\eta_1, \xi_1)$, cannot belong to U.

The whole set U is therefore the union of disjoint component
intervals. This union is at most countable, since in each of the
component open intervals of U, a rational point can be chosen,
and the rationals form a countable set. Our proof is therefore
complete.
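
When an open set on the line is given as the union of finitely many open intervals, its component open intervals can be found by sorting and merging. The sketch below treats only this finite case, which is an assumption made here for illustration; the function name `components` is likewise introduced here.

```python
def components(intervals):
    """Component open intervals of a finite union of open intervals.

    Each interval is a pair (left, right) with left < right. Intervals that
    merely touch, such as (0, 1) and (1, 2), are not merged, because their
    common end-point does not belong to the union.
    """
    result = []
    for left, right in sorted(i for i in intervals if i[0] < i[1]):
        if result and left < result[-1][1]:      # overlaps the last component
            result[-1] = (result[-1][0], max(result[-1][1], right))
        else:                                    # starts a new component
            result.append((left, right))
    return result


parts = components([(5, 6), (0, 1), (0.5, 2), (2, 3), (5.5, 5.8)])
```

In the example the five given intervals yield the three component intervals (0, 2), (2, 3) and (5, 6); the point 2 separates the first two because it does not belong to the set.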


Problems. 1. If a set E on the line is covered by an arbitrary system of
open intervals, an (at most countable) subsystem can be extracted which
also covers E.


Hint. Of the open intervals that constitute the covering, select those with 
rational end-points, and discard one by one those intervals in the covering 
that contain a selected interval. 

2. If a set E in the plane is covered by an arbitrary system of open discs,
an (at most countable) subsystem can be extracted which also covers E.

Hint. Cf. problem 1.

3. A metric space M is said to possess a countable base if there exists a
countable system of open sets $U_1, U_2, \ldots, U_k, \ldots$, such that for any point $x \in M$
and any region U that contains x, there exists k such that $x \in U_k \subset U$. Show
that a theorem analogous to those of problems 1 and 2 holds in any metric 
space with a countable base. 

4. Prove that the set of all interior points of any set E is open (provided
it is non-empty).

5. Prove that the aggregate of all open sets on the line has the power of 
the continuum. 


Hint. Use theorem 6, Chapter I, Section 5. 


3. CONVERGENT SEQUENCES AND CLOSED SETS


1. We shall say that a sequence of points $x_1, x_2, \ldots, x_\nu, \ldots$ of a
metric space M converges to a point x of that space if

$$\lim_{\nu \to \infty} \rho(x, x_\nu) = 0,$$

or alternatively, that the sequence $x_1, x_2, \ldots$ converges to x if,
given any sphere with centre x, it is the case that all points of the
sequence from some point onwards lie inside it.




The point x is said to be the limit of the sequence $x_1, x_2, \ldots, x_\nu, \ldots$
and is denoted by $\lim_{\nu \to \infty} x_\nu$.

It is not difficult to show that if the limit x exists, it is
determined uniquely. In fact, if we had

$$\lim_{\nu \to \infty} \rho(x, x_\nu) = 0, \quad \lim_{\nu \to \infty} \rho(y, x_\nu) = 0,$$

then for any given $\varepsilon > 0$, we could find a number N such that the
inequalities

$$\rho(x, x_\nu) < \frac{\varepsilon}{2}, \quad \rho(y, x_\nu) < \frac{\varepsilon}{2} \quad (\nu \ge N)$$

would hold, and so, by the triangle inequality

$$\rho(x, y) \le \rho(x, x_\nu) + \rho(y, x_\nu) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Since in the inequality obtained $\varepsilon$ is arbitrarily small,

$$\rho(x, y) = 0,$$

whence, in virtue of axiom 2,

$$x = y,$$

as required.


Examples. 1. Let a metric space M in n-dimensional space $R_n$
(Section 1, example 1) be such that the distance between the
points $x = (\xi_1, \ldots, \xi_n)$, $y = (\eta_1, \ldots, \eta_n)$ is given by the formula

$$\rho(x, y) = \sqrt{\sum_{j=1}^{n} (\xi_j - \eta_j)^2}.$$

We shall explain what is meant by the convergence of the sequence

$$x_\nu = (\xi_1^{(\nu)}, \xi_2^{(\nu)}, \ldots, \xi_n^{(\nu)}) \quad (\nu = 1, 2, \ldots)$$

to the point

$$x = (\xi_1, \ldots, \xi_n).$$

Since

$$\rho(x, x_\nu) = \sqrt{\sum_{j=1}^{n} (\xi_j - \xi_j^{(\nu)})^2},$$

$\rho(x, x_\nu)$ tends to zero as $\nu \to \infty$ if and only if all numerical sequences
$\xi_1^{(\nu)}, \xi_2^{(\nu)}, \ldots, \xi_n^{(\nu)}$ ($\nu = 1, 2, \ldots$) tend to the limits $\xi_1, \xi_2, \ldots, \xi_n$
respectively as $\nu \to \infty$. Briefly, convergence in $R_n$ is convergence
with respect to all coordinates.




2. The convergence of a sequence of functions $x_\nu(t) \in C(a, b)$ to
a function x(t) means that, as $\nu \to \infty$,

$$\rho(x, x_\nu) = \max_{a \le t \le b} |x(t) - x_\nu(t)| \to 0.$$

In analysis such convergence is termed uniform convergence.

3. The convergence of a sequence of functions $x_\nu(t)$ in the space
$C_p(a, b)$ to a function x(t) means that, as $\nu \to \infty$,

$$\rho^p(x, x_\nu) = \int_a^b |x(t) - x_\nu(t)|^p \, dt \to 0.$$

In analysis such convergence is termed convergence in the mean of
order p, or if p is fixed, simply convergence in the mean.

Every uniformly convergent sequence is clearly also convergent 
in the mean for any p. 

But it is easy to construct a sequence of functions, convergent in
the mean for any p, but not uniformly convergent. For instance,
let $x_\nu(t)$ be a continuous function taking values between zero and
unity, non-zero only in an open interval $\Delta_\nu$ of length less than $1/\nu$,
and taking the value 1 at some point of this interval. Clearly

$$\int_a^b x_\nu^p(t) \, dt < \frac{1}{\nu},$$

so that the sequence is convergent in the mean to zero. But
$\max x_\nu(t) = 1$ for any $\nu$, so the sequence does not converge uniformly
to zero. It can be shown that this sequence does not converge
uniformly to any function at all. Moreover the intervals $\Delta_\nu$ can be so
chosen that the sequence fails to converge at every single value of t.
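
A concrete sequence of this kind on the interval [0, 1] is formed by "tent" functions. In the sketch below the particular choice of the interval $\Delta_\nu = (0, 1/(\nu + 1))$, the piecewise-linear shape and all names are assumptions made here for illustration; the integral is approximated by the trapezoidal rule.

```python
def tent(nu):
    """Continuous x_nu on [0, 1]: zero outside (0, 1/(nu + 1)), peak value 1.

    Returns the function together with the point where the peak is attained.
    """
    half = 1.0 / (2 * (nu + 1))          # half-width of the support

    def x(t):
        return max(0.0, 1.0 - abs(t - half) / half)

    return x, half


def integral_p(x, p, n=20_000):
    """Trapezoidal approximation to the integral of x(t)^p over [0, 1]."""
    h = 1.0 / n
    vals = [x(i * h) ** p for i in range(n + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


table = []
for nu in (1, 4, 9):
    x_nu, peak = tent(nu)
    table.append((nu, x_nu(peak), integral_p(x_nu, 2)))
```

In each row of `table` the maximum of $x_\nu$ equals 1, while the integral of $x_\nu^2$ is less than $1/\nu$, so the distance from zero in the metric (2) stays equal to 1 while the distance in the mean tends to zero.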


Lemma. If $x_\nu \to x$, $y_\nu \to y$, then $\rho(x_\nu, y_\nu) \to \rho(x, y)$ (distance is a
continuous function of its arguments).

Proof. By the quadrangle inequality (Section 1, art. 2)

$$|\rho(x, y) - \rho(x_\nu, y_\nu)| \le \rho(x, x_\nu) + \rho(y, y_\nu).$$

The right-hand side tends to zero as $\nu \to \infty$, as required.


2. A point x of a metric space M is said to be a limit point of a
set $F \subset M$ if there exists a sequence $x_1, x_2, \ldots$ of (distinct) points
of F convergent to the point x.

Another definition of a limit point, obviously equivalent to the
above, runs as follows: a point x is a limit point of a set F if any
sphere with centre x contains points of F (distinct from x).




The subset $F \subset M$ is said to be closed if it contains all its limit
points. 


Examples. 1. The interval $a \le x \le b$ on the real line is closed,
but the interval $a \le x < b$ is not, since it does not contain its
limit point b.

2. In any metric space, the sphere

$$U = \{x : \rho(x, x_0) \le r\}$$

is a closed set (and is therefore termed a closed sphere).

For let us take any point $x_1$ not belonging to U, so that
$\rho(x_1, x_0) = r_1 > r$. We show that the sphere, centre $x_1$, radius
$\frac{1}{2}(r_1 - r)$, contains no points of U; for if it contained such a
point, then, denoting it by x, we should have

$$\rho(x_1, x_0) \le \rho(x_1, x) + \rho(x, x_0) \le \frac{r_1 - r}{2} + r < r_1$$

in contradiction to our construction. It follows that $x_1$ cannot be
a limit point of the set U.

Closed sets in a metric space M bear a close relation to open 
sets of the space. We have the following theorem: 


THEOREM. A set U, complementary to a closed set F with respect
to a metric space M, is always open. A set F, complementary to an
open set U, is always closed.

Proof. Let F be a closed set and U its complement; we shall
show that U is open. We consider an arbitrary point $x_0 \in U$; we
have to show that there exists a sphere defined by an inequality
of the form

$$\rho(x, x_0) < r,$$

and contained entirely in U.

If this is not the case, we must assume that any sphere with
centre $x_0$ contains points of F. But then, according to the second
definition of a limit point, $x_0$ is a limit point of F, and we must
have $x_0 \in F$, which contradicts the hypothesis $x_0 \in U$. U is therefore
open.

We turn now to the second part of the theorem. Let U be an
open set and F its complement; we shall show that F is closed.
Any point $x_0$ that belongs to U is the centre of some sphere contained
in U and cannot, therefore, be a limit point of F. So the
limit points of F can lie only in F itself, and consequently F is
closed. The theorem is proved.

Recalling Section 2, art. 2, we obtain a general characterisation 
of all closed sets on the line $-\infty < x < \infty$: every closed set on
the line can be represented as the complement of a finite or countable 
aggregate of disjoint open intervals. These intervals, which are the 
component open intervals of the complement of the closed set, are 
called the contiguous open intervals of the closed set. 

Using the known properties of open sets in a metric space and 
the relation we have just found between open and closed sets, 
we can prove that the union of a finite number of closed sets and the 
intersection of any number of closed sets are again closed sets. 

For let closed sets $F_\nu$ be given ($\nu$ running through some set of
indices), and let $U_\nu = CF_\nu$ be the complementary open sets. By
formula (1), Section 2, Chapter I, we have:

$$C \sum_\nu F_\nu = \prod_\nu CF_\nu = \prod_\nu U_\nu.$$

If $\nu$ runs through a finite set of indices, $\prod_\nu U_\nu$ is, in virtue of Section 2,
an open set; hence the set $\sum_\nu F_\nu$, which is the complement
of $\prod_\nu U_\nu$, is closed. Moreover, by (2), Section 2, Chapter I,

$$C \prod_\nu F_\nu = \sum_\nu CF_\nu = \sum_\nu U_\nu. \tag{5}$$

The set $\sum_\nu U_\nu$ is always open; hence its complement $\prod_\nu F_\nu$ is always
closed, as required.


3. A set A contained in a metric space M is said to be everywhere
dense in M if each point $b \in M$ is the limit of a sequence of
points $a_\nu \in A$ (not necessarily distinct). In other words, A is
everywhere dense in M if any sphere whose centre is a point
$b \in M$ contains a point $a \in A$.

Thus the set of rationals is everywhere dense on the line
$-\infty < x < \infty$. By a well-known theorem of Weierstrass, every
continuous function f(x) on the closed interval [a, b] can be
expressed as the limit of a uniformly convergent sequence of
polynomials; hence the set of all polynomials is everywhere dense
in the space C(a, b).

The property of being everywhere dense possesses a peculiar
"transitivity": if a set A is everywhere dense in a space M and
M in turn is everywhere dense in a larger space P, then A, considered
as a subset of P, is everywhere dense in P. For since M
is everywhere dense in P we can find for a given $x \in P$ and $\varepsilon > 0$
a point $b \in M$ such that $\rho(b, x) < \varepsilon/2$, and since, further, A is
everywhere dense in M we can find a point $a \in A$ such that
$\rho(a, b) < \varepsilon/2$. By the triangle inequality, $\rho(a, x) \le \rho(a, b) + \rho(b, x) < \varepsilon$.
Thus in any sphere of radius $\varepsilon$ with centre at a point $x \in P$
there exists a point $a \in A$, as required.


Example. Let us show that the totality of trigonometric polynomials

$$T(x) = \sum_{k=0}^{n} (a_k \cos kx + b_k \sin kx)$$

constitutes a set everywhere dense in the space $C_p(-\pi, \pi)$. It is well-known from analysis
that every continuous function f(x) in $C_p(-\pi, \pi)$ which possesses
a piecewise-continuous derivative and satisfies the condition
$f(-\pi) = f(\pi)$ has a Fourier series development

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx),$$

which converges uniformly, and therefore with respect to the metrics of both
the spaces $C(-\pi, \pi)$ and $C_p(-\pi, \pi)$.

Thus the trigonometric polynomials form a set A which is everywhere
dense in the set M of continuous functions specified. Further,
any continuous function g(x) on the closed interval $[-\pi, \pi]$ that
satisfies the condition $g(-\pi) = g(\pi)$ is the limit of a uniformly
convergent sequence of functions $f_n(x)$ in M, for instance, of the
polygonal lines which coincide with g(x) at the points $k\pi/n$
($k = 0, \pm 1, \ldots, \pm n$). Finally, any continuous function $\varphi(x)$ is the
limit of a sequence of continuous functions $\varphi_n(x)$ with $\varphi_n(-\pi) = \varphi_n(\pi)$
which converges in the metric of the space $C_p(-\pi, \pi)$;
for example, we can put $\varphi_n(x) = \varphi(x)$ for $|x| \le \pi - 1/n$,
$\varphi_n(-\pi) = \varphi_n(\pi) = 0$, and define $\varphi_n(x)$ on the remaining intervals
$(-\pi, -\pi + 1/n)$, $(\pi - 1/n, \pi)$ linearly. In virtue of the transitivity
mentioned, we get that the set of trigonometric polynomials is
everywhere dense in the space $C_p(-\pi, \pi)$, as required.

4. Let there be given an arbitrary subset A of a metric space M;
we denote by $\bar{A}$ the union of the set A with all its limit points (not
contained in it). If A is a closed set, then $\bar{A} = A$; and conversely,
if $\bar{A}$ coincides with A, then A contains all its limit points and is
therefore closed. In the general case the set $\bar{A}$ is said to be the
closure of the set A. From the definition of closure it follows easily
that a given point $b \in M$ belongs to the set $\bar{A}$ if and only if there
exists in any sphere with centre b a point $a \in A$ (possibly coincident
with b). In particular, it is clear that A is everywhere dense in its
own closure; and conversely if A is everywhere dense in some
set Q, then $Q \subset \bar{A}$.

Utilising this last observation, we shall show that the closure of
an arbitrary set $A \subset M$ is always closed. In other words, the closure $\bar{\bar{A}}$
of the set $\bar{A}$ coincides with $\bar{A}$ itself. We know by hypothesis that A
is everywhere dense in $\bar{A}$ and that $\bar{A}$ is everywhere dense in $\bar{\bar{A}}$;
hence in accordance with art. 3 A is everywhere dense in $\bar{\bar{A}}$ and
therefore $\bar{\bar{A}}$ is contained in the closure of A, i.e. $\bar{A} \supset \bar{\bar{A}}$; but this
means that $\bar{A}$ is closed.

We observe further that every closed set F that contains A
must, in addition, contain all the limit points of A, and hence the
whole set $\bar{A}$. Since, as we have shown, $\bar{A}$ is closed, it can be characterised
as the smallest closed set that contains the set A.


Examples. 1. The closure of the set A of all rational points on
the real line is the aggregate of all the points (rational and irrational)
on the line.

2. The closure in the space C(a, b) of the set of all polynomials
$P(x) = a_0 + a_1 x + \ldots + a_n x^n$ is the whole space C(a, b).


Problems. 1. Given any set A, we denote by A' the set of its limit points.
Construct a set A on the line, such that A'' = (A')' is non-empty but A'''
is empty.

2. Prove that the set A' (cf. problem 1) is closed, whatever the set A.

3. Given that A' is countable, show that A is countable (assuming A to
be a subset of the line).

4. Prove that the result of problem 3 holds in any metric space with a 
countable base (Section 2, problem 3). 

5. A point x on the line is said to be an accumulation point of an uncount- 
able set A if every neighbourhood of x contains an uncountable subset of A. 
Show that every uncountable set A has points of accumulation; moreover, 
almost all its points, except perhaps for a countable set, are accumulation 
points. 

Hint. A point which is not an accumulation point of the set A can be 
covered by an interval with rational end-points and containing an at most 
countable number of points of the set. 

6. Show that the result of problem 5 is valid in any metric space with a 
countable base. 

7. Prove that the set M of points in the plane lying on the unit circle $\Gamma$
with centre the origin and having polar angle coordinates 1, 2, ..., n, ... is
everywhere dense on $\Gamma$.

Hint. If an arc $\Delta_0 \subset \Gamma$ does not contain any points of the set M, then
neither do the arcs $\Delta_1 = \Delta_0 - 1$, $\Delta_2 = \Delta_0 - 2$, etc. The arcs $\Delta_0, \Delta_1, \Delta_2, \ldots$,
of equal length, and lying on $\Gamma$, cannot be mutually disjoint. If, say, the
arcs $\Delta_p$ and $\Delta_{p+m}$ intersect, then the union of $\Delta_p, \Delta_{p+m}, \Delta_{p+2m}, \ldots$ covers
the whole of $\Gamma$, which is impossible.

8. The quantity

$$\rho(x, A) = \inf_{y \in A} \rho(x, y)$$

is said to be the distance of the point x from the set A. Show that for a
closed set A the relations

$$\rho(x, A) = 0, \qquad x \in A,$$

are equivalent; but if A is not closed, this is not the case.

9. Show that for any set A the aggregate of points x for which $\rho(x, A) < \varepsilon$
is open, while the aggregate of points y for which $\rho(y, A) \le \varepsilon$ is closed.

10. Show that every closed set F on the line is the intersection of a countable
set of open sets. Similarly for a metric space with a countable base.

Hint. Put $U_n = \{x : \rho(x, F) < 1/n\}$.

11. Given two disjoint closed sets $F_1$, $F_2$, construct two disjoint open sets
$U_1$, $U_2$, such that $U_1 \supset F_1$, $U_2 \supset F_2$.

Hint. Put $U_1 = \bigcup_{x \in F_1} \{y : \rho(x, y) < \tfrac{1}{2}\rho(x, F_2)\}$; similarly for $U_2$.

12. Show that every open set is the union of a countable set of closed sets 
(on the line or in a space with a countable base). 

13. Show that the projection of a plane, closed, bounded set on the line 
is a closed set. Is it necessary to assume boundedness? 

Hint. As an example, consider the projection of a rectangular hyperbola 
on an asymptote. 

14. A metric space M is said to be separable if it contains a countable set A
which is dense in it (so that the closure of A coincides with the whole of M).
Show that the existence of a countable base is equivalent to separability.

15. Show that the spaces C(a, b) and $D_m(a, b)$ are separable.

Hint. The set of polynomials with rational coefficients can be taken as a 
countable, everywhere dense set. 


4. COMPLETE SPACES


1. A sequence of points $x_1, x_2, \ldots, x_\nu, \ldots$ in a metric space M
is said to be fundamental if, for any $\varepsilon > 0$, there exists a number N
such that for $\mu, \nu > N$,

$$\rho(x_\mu, x_\nu) \le \varepsilon.$$

In such instances, we shall use the abbreviation

$$\lim_{\mu, \nu \to \infty} \rho(x_\mu, x_\nu) = 0.$$

For example, any convergent sequence is fundamental.
For by the triangle inequality,

$$\rho(x_\mu, x_\nu) \le \rho(x_\mu, x) + \rho(x, x_\nu),$$

and if $x_\nu \to x$, then for sufficiently large $\mu, \nu$, the right-hand side
becomes smaller than any preassigned $\varepsilon$.

If M is the real line with the usual metric, the concept of a 
fundamental sequence of points coincides with the classical con- 
cept of a fundamental numerical sequence. In the theory of real 
numbers there is a criterion due to Cauchy according to which 
every fundamental numerical sequence converges. 

Cauchy’s criterion is not valid in a general metric space. 

We consider the open interval (0, 1); it represents a metric space 
with the usual metric for the real line. The sequence 1/2, 1/3, ..., 1/n, ...
is clearly fundamental in this space, but does not converge in it. 
Cauchy’s criterion is not, therefore, applicable in the metric space 
(0, 1). 

2. Thus in a general metric space it is not possible to employ 
Cauchy’s criterion. If the criterion turns out to be valid in certain 
particular metric spaces, this is due to the special properties of 
these spaces. We distinguish the class of such spaces by means 
of the following definition: 

A metric space M is said to be complete if, in it, every funda- 
mental sequence converges. 


Examples. 1. We show that the n-dimensional space $R_n$ with
metric

$$\rho(x, y) = \sqrt{\sum_{j=1}^{n} (\xi_j - \eta_j)^2}$$

is complete. Let $x_\nu = (\xi_1^{(\nu)}, \ldots, \xi_n^{(\nu)})$ be a fundamental sequence.
Since

$$|\xi_j^{(\mu)} - \xi_j^{(\nu)}|^2 \le \sum_{k=1}^{n} |\xi_k^{(\mu)} - \xi_k^{(\nu)}|^2 = \rho^2(x_\mu, x_\nu),$$

the numerical sequence $\xi_j^{(\nu)}$ ($\nu = 1, 2, \ldots$) is fundamental for every
fixed j = 1, 2, ..., n, and as such has some limit $\xi_j$. The numbers
$\xi_1, \xi_2, \ldots, \xi_n$ determine a vector $x \in R_n$. Since

$$\rho^2(x, x_\nu) = \sum_{j=1}^{n} (\xi_j - \xi_j^{(\nu)})^2 \to 0 \quad \text{as } \nu \to \infty,$$

the vector x is the limit of the given fundamental sequence.
Hence every fundamental sequence in the space $R_n$ converges
in it.

2. The space C(a, b) is complete. In fact, if the sequence of
functions $y_\nu(x) \in C(a, b)$ is fundamental, then as $\mu, \nu \to \infty$,

$$\sup_{a \le x \le b} |y_\mu(x) - y_\nu(x)| \to 0. \tag{1}$$

We fix x; in virtue of (1), the numbers $y_\nu(x)$ form a fundamental
numerical sequence, which by the classical Cauchy criterion must
converge. Let y(x) be the limit of $y_\nu(x)$ as $\nu \to \infty$. Letting $\nu \to \infty$
in the inequality

$$\sup_{a \le x \le b} |y_\mu(x) - y_\nu(x)| \le \varepsilon \qquad (\mu, \nu > N = N(\varepsilon)),$$

we obtain

$$\sup_{a \le x \le b} |y(x) - y_\mu(x)| \le \varepsilon \qquad (\mu > N = N(\varepsilon)). \tag{2}$$

Hence y(x) is the limit of a uniformly convergent sequence of
continuous functions $y_\mu(x)$ and by a well-known theorem of analysis
is therefore continuous. It follows further from (2) that
$\rho(y, y_\mu) \to 0$, so that in the space C(a, b) every fundamental sequence
is convergent; C(a, b) is therefore a complete space.

3. A closed subset A of a complete metric space M, when considered
as an independent metric space (with a metric borrowed
from M), is itself complete. For every fundamental sequence
$y_\nu \in A$ converges in M (since M is complete), and its limit belongs
to A since A is closed. Conversely, if a subset $A \subset M$ is known to
be a complete metric space, then A is closed in M. For if A were
not closed in M we could find a sequence $\{x_\nu\}$ in A that converged
to some point $y \in M - A$. But the sequence $\{x_\nu\}$ is fundamental
(since the metric in A is borrowed from M), and since A is complete,
it must possess a limit in A. We should then have a sequence
with two distinct limits, one in A, the other outside A, which is
impossible.

4. The space $C_p(a, b)$ is not complete for any p. For the proof we
consider a sequence of continuous functions $y_\nu(x)$, taking values
between 0 and 1, and, as $\nu \to \infty$, tending uniformly to 0 on every
open interval $(a, c - \varepsilon)$, and to 1 on every open interval $(c + \varepsilon, b)$
(where c is a fixed point between a and b). This is a Cauchy sequence,
since

$$\int_a^b |y_\mu(x) - y_\nu(x)|^p \, dx = \int_a^{c-\varepsilon} + \int_{c-\varepsilon}^{c+\varepsilon} + \int_{c+\varepsilon}^{b} \le \varepsilon + 2\varepsilon + \varepsilon = 4\varepsilon$$

for sufficiently large $\mu, \nu$ (the middle integral does not exceed $2\varepsilon$ because
the integrand is at most 1). But at the same time the sequence $y_\nu(x)$
is not convergent in the mean to any continuous function.

To prove the last assertion we make the following observation.
If a sequence of functions $f_\nu(x)$ ($\nu = 1, 2, \ldots$) is convergent in the
mean over an interval $\Delta = \{a \le x \le b\}$ to a continuous function
f(x), and also converges uniformly to a function $\varphi(x)$ in some interval
$\delta = \{c \le x \le d\}$ interior to $\Delta$, then in the interval $\delta$, $\varphi(x) = f(x)$
identically. For in the space $C_p(c, d)$ we have the relations

$$\rho^p(f_\nu, f) = \int_c^d |f_\nu(x) - f(x)|^p \, dx \le \int_a^b |f_\nu(x) - f(x)|^p \, dx \to 0,$$

$$\rho^p(f_\nu, \varphi) = \int_c^d |f_\nu(x) - \varphi(x)|^p \, dx \le \max_{c \le x \le d} |f_\nu(x) - \varphi(x)|^p \, (d - c) \to 0.$$

And in virtue of the uniqueness of limits (Section 2), we have
$f(x) = \varphi(x)$.

If we suppose that the sequence $y_1(x), y_2(x), \ldots, y_\nu(x), \ldots$ constructed
above is convergent in the mean to some continuous
function f(x), then by what we have proved we must have
f(x) = 0 for $a \le x < c$, f(x) = 1 for $c < x \le b$. But there is
obviously no value f(c) for which the function f(x) will be continuous
over the closed interval $a \le x \le b$.
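The behaviour of such a sequence can be observed numerically. The sketch below assumes the particular choice a = 0, b = 1, c = 1/2 and piecewise-linear ramps of slope $\nu$ through the point (c, 1/2); these choices and the names `ramp` and `mean_distance` are illustrative, not taken from the text. The integral is approximated by the midpoint rule.

```python
def ramp(nu, x, c=0.5):
    """Continuous function with values in [0, 1]: equal to 0 for x <= c - 1/(2*nu),
    equal to 1 for x >= c + 1/(2*nu), and linear in between."""
    return min(1.0, max(0.0, 0.5 + nu * (x - c)))


def mean_distance(mu, nu, p=1, steps=20000):
    """Midpoint-rule approximation to the C_p(0, 1) distance between ramp(mu, .) and ramp(nu, .)."""
    h = 1.0 / steps
    total = h * sum(
        abs(ramp(mu, (k + 0.5) * h) - ramp(nu, (k + 0.5) * h)) ** p
        for k in range(steps)
    )
    return total ** (1.0 / p)
```

For p = 1 and $\mu > \nu \ge 1$ the exact distance between two such ramps is $\tfrac{1}{4}(1/\nu - 1/\mu)$, which tends to zero, whereas the pointwise limit of the ramps jumps from 0 to 1 at c.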


Problems. 1. Taking the metric

$$\rho(x, y) = |\tan^{-1} x - \tan^{-1} y|$$

on the line $\{-\infty < x < \infty\}$, verify that all the axioms for a metric space
are satisfied. Is this space complete?

Answer. The space is incomplete; the sequence $x_n = n$ (n = 1, 2, ...) is
fundamental but has no limit.

2. Show that the space $D_m(a, b)$ is complete for all m.

3. Is the space of all numerical sequences

$$x = (\xi_1, \xi_2, \ldots, \xi_n, \ldots),$$

where $\xi_n \to 0$ as $n \to \infty$, with metric given by

$$\rho(x, y) = \max_n |\xi_n - \eta_n|,$$

complete?

Answer. Yes.

4. Consider the following three spaces of functions on the real line:
(a) all bounded continuous functions;
(b) all continuous functions for which $\lim_{|x| \to \infty} f(x) = 0$;
(c) all continuous functions, each identically zero outside some open interval.
Take as a metric in these spaces

$$\rho(f, g) = \sup_x |f(x) - g(x)|.$$

Are the given spaces complete?
Answer. Complete in cases (a) and (b), incomplete in (c).
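The answer to problem 1 can be made plausible by a short computation. The following sketch uses only the standard `math.atan`; the helper names `rho` and `gap` are illustrative assumptions. Since the inverse tangent is increasing and bounded above by $\pi/2$, the distance between any two terms with indices at least N is less than `gap(N)`, while the distance from $x_n = n$ to a fixed real x tends to `gap(x)`, which is positive.

```python
import math


def rho(x, y):
    """The metric of problem 1: distance between the inverse tangents of x and y."""
    return abs(math.atan(x) - math.atan(y))


def gap(x):
    """pi/2 - atan(x): bounds rho(m, n) for m, n >= x, and is the limit of rho(n, x) as n grows."""
    return math.pi / 2 - math.atan(x)
```

Thus `gap(N)` tends to zero as N increases, which shows that the sequence is fundamental, while `gap(x) > 0` for every real x shows that no point of the line can serve as its limit.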




3. Lemma concerning closed spheres. The completeness of the
real line is used in analysis to prove a well-known lemma about
embedded closed intervals: a sequence of embedded closed intervals
has a common point. An analogous result holds in a complete
metric space with closed spheres substituted for closed intervals:

Lemma concerning closed spheres: In a complete metric space,
a sequence of embedded closed spheres,

$$U_\nu = \{y : \rho(y, y_\nu) \le r_\nu\}, \qquad \nu = 1, 2, \ldots,$$

the radii $r_\nu$ of which tend to zero as $\nu \to \infty$, has a common point.
Proof. For all $\mu$, the centre $y_{\nu+\mu}$ of the sphere $U_{\nu+\mu}$ lies in the
sphere $U_\nu$, so that

$$\rho(y_\nu, y_{\nu+\mu}) \le r_\nu;$$

hence the sequence of centres $y_1, y_2, \ldots, y_\nu, \ldots$ is fundamental.
Since the space, M say, is complete, this sequence has a limit $y_0$.
Since the sphere $U_\nu$ is a closed set in M (Section 3, art. 2, example 2)
and the points $y_\nu, y_{\nu+1}, \ldots$ belong to it, $y_0 = \lim_{\mu \to \infty} y_{\nu+\mu}$ also belongs
to it; the number $\nu$ here is arbitrary, hence $y_0$ belongs to all the
spheres $U_\nu$, as asserted.

In Chapter I, using a theorem on embedded closed intervals, 
we proved that the point-set of the closed interval [0,1] was 
uncountable. We can now reason analogously for a wide class of 
complete metric spaces. 

First we give the following definition: a point $x_0$ of a metric
space M is said to be isolated if there exists a sphere $\rho(x, x_0) \le \delta$
containing no points of M with the exception of the point $x_0$ itself.
Thus if M is some point-set on the line $-\infty < x < \infty$ with the
usual metric, $x_0 \in M$ is an isolated point if there exists an open
interval containing $x_0$, but not containing any other point of M.


THEOREM 1. Every complete metric space M without isolated 
points is uncountable. 

Proof. Let us suppose, on the contrary, that the points of M
form a countable set. Then they can all be arranged in a sequence
$x_1, x_2, \ldots, x_\nu, \ldots$ Let $y_1$ be a point of M distinct from $x_1$; we
denote by $U_1$ a sphere with centre $y_1$ not containing (either
within, or on the boundary) the point $x_1$. Since $y_1$ is not isolated,
this sphere contains other points of M. Let $y_2$ be an interior
point of $U_1$, distinct from $x_2$; we denote by $U_2$ a sphere with
centre $y_2$, contained entirely in $U_1$ and not containing the point $x_2$.
Continuing this process, we obtain a sequence of spheres $U_1 \supset U_2 \supset \ldots$,
the radii of which tend to zero, and in which the sphere
$U_\nu$ does not contain the point $x_\nu$. The common point $\xi$ of all the
$U_\nu$, which exists in virtue of the lemma just proved, cannot
coincide with any of the points $x_1, x_2, \ldots, x_\nu, \ldots$, in view of our
construction. Hence the sequence $x_1, x_2, \ldots$ cannot exhaust the
whole space M, contrary to our hypothesis; the theorem is proved.

We saw above (example 3) that a closed set in a complete metric 
space is itself a complete metric space. Hence in a complete metric 
space any closed set without isolated points is uncountable. 

Note. If we reject the hypothesis that the space M contains no 
isolated points, the theorem ceases to be true. Any countable 
closed set (e.g. a convergent sequence) on the line provides a 
suitable counter-example when considered as an independent 
metric space. 

We consider now a closed set F on the real axis. It is evident 
that every isolated point of F is a common end-point of two open 
intervals contiguous to F, or, what amounts to the same thing, 
is a common end-point of two component open intervals of the 
complementary open set. 

A closed set without isolated points is said to be perfect. We saw 
(Section 3) that every closed set on the line can be obtained by 
extracting some set of disjoint open intervals; it is now apparent 
that a perfect set can be obtained by the extraction of some set 
of open intervals, not merely disjoint, but also without common 
end-points. 

Any closed interval [a, b] will serve as an example of a perfect
set. But it is easy to construct a perfect set that does not contain
a single closed interval as a subset. In particular, we consider the
construction of the so-called Cantor set.

The Cantor set on the closed interval [0, 1] is constructed as
follows. We first extract the open interval (1/3, 2/3) of length 1/3,
which forms the middle third of the whole closed interval. Then
we proceed similarly with each of the two remaining closed intervals
[0, 1/3], [2/3, 1], i.e. we extract the middle third from each,
the open interval (1/9, 2/9) from [0, 1/3], and (7/9, 8/9) from
[2/3, 1]. A similar procedure obtains for each of the four remaining
closed intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1], and the process
is continued indefinitely. As a result, we are left with a closed
set called the Cantor set. Since the intervals extracted in the construction
had no common end-points, the Cantor set is perfect.
And since of every pair of adjacent open intervals with vertices
at the points $p/3^k$, $(p + 1)/3^k$, $(p + 2)/3^k$, one at least was extracted,
the Cantor set cannot contain a single closed interval. Nevertheless,
in virtue of the theorem just proved, the Cantor set is uncountable.
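The extraction process just described can be followed step by step with exact rational arithmetic. The sketch below is only an illustration; the function names `cantor_step` and `cantor_intervals` are assumptions introduced here. Each closed interval is stored as a pair of end-points, and one step replaces it by the two closed intervals that remain when its open middle third is extracted.

```python
from fractions import Fraction


def cantor_step(intervals):
    """Extract the open middle third from each closed interval (a, b) in the list."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # left remaining closed interval
        result.append((b - third, b))      # right remaining closed interval
    return result


def cantor_intervals(k):
    """Closed intervals remaining after k extraction steps, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = cantor_step(intervals)
    return intervals
```

Calling `cantor_intervals(2)` reproduces the four closed intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1] named above. After k steps there remain $2^k$ intervals of total length $(2/3)^k$, which tends to zero; this agrees with the fact that the Cantor set contains no closed interval.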

More precisely, the power of the Cantor set is the power of the 
continuum. This is a consequence of the following theorem, which 
applies to any perfect set on a closed interval: 


THEOREM 2 (G. Cantor). Every perfect set on the closed interval
[a, b] has the power of the continuum.


Proof. We shall assume that the points a, b are exact bounds of
the set F; then, since F is closed, we have $a \in F$, $b \in F$.

If the perfect set F contains even one closed interval $[\alpha, \beta]$, the
assertion of the theorem is obvious. We shall consider the case
when F contains no closed intervals at all. In this case, between
any two open intervals contiguous to F there is another contiguous
open interval, and so the number of contiguous open intervals is
infinite. Since F contains no isolated points, the intervals contiguous
to F are not only disjoint, but also have no end-points in common.

The points of F can be divided into two classes, the first class 
containing points which are end-points of open intervals contiguous 
to F, termed points of the first type, and the second containing all 
the remaining points, termed points of the second type. Since the 
aggregate of open intervals contiguous to a perfect set is countable, 
the set of points of the first type is also countable. By theorem 1, 
the whole set F is uncountable, whence it follows that points of 
the second type always exist (which was previously not at all 
obvious) and form an uncountable set. We now use another con- 
struction, which is independent of theorem 1 and is based on a 
special property of the line; it will again be evident that points of 
the second type exist, and we shall see furthermore that they con- 
stitute a set with the power of the continuum. 

We establish an order-preserving one-one correspondence between
the set of open intervals contiguous to F and the set of
dyadic rationals on the closed interval [0, 1]: if the contiguous
open interval $\Delta''$ lies to the right of $\Delta'$, the corresponding dyadic
rationals are connected by the inequality $r' < r''$. For example,
such a map can be constructed as follows. We first map the
contiguous open interval $\Delta_1$ of greatest length (if there are several of
these, we choose one at random) onto 1/2. Of all the contiguous
open intervals lying to the left of $\Delta_1$, we choose the greatest $\Delta_2$
(with the same stipulation if there are several of these) and map
it onto the point 1/4; similarly of those lying to the right of $\Delta_1$,
we choose the greatest $\Delta_3$ and map it onto 3/4. On each of the four
resulting closed intervals (to the left of $\Delta_2$, between $\Delta_2$ and $\Delta_1$,
between $\Delta_1$ and $\Delta_3$, and to the right of $\Delta_3$) we choose the greatest
open interval and map the intervals chosen onto the numbers 1/8,
3/8, 5/8, 7/8, respectively. Order is clearly preserved under this
mapping. Continuing this process indefinitely, we ultimately arrive
at the desired order-preserving map between the set of contiguous
open intervals of F and the set of dyadic rationals in [0, 1].

We now extend this map, on the one hand to the points of F,
on the other to the dyadic irrationals of the closed interval [0, 1].
Let $\xi \in [0, 1]$ be a dyadic irrational. It divides the dyadic rationals
in [0, 1] into two classes: a left $K_1$ and a right $K_2$. The set of all
contiguous open intervals of F is likewise divided under the correspondence
into two classes, $D_1$ and $D_2$. We denote by $\eta_1$ the exact
upper bound of the points contained in open intervals of the class
$D_1$, and by $\eta_2$ the exact lower bound of the points contained in
intervals of $D_2$.

Between each dyadic rational point of [0, 1] and the point $\xi$, other
dyadic rational points can be found; hence between each contiguous open
interval of F that lies to the left of $\eta_1$ and the point $\eta_1$ itself can
be found other contiguous open intervals; similarly, between each
contiguous open interval of F that lies to the right of $\eta_2$ and $\eta_2$
itself can be found other contiguous open intervals. Hence the
points $\eta_1$ and $\eta_2$ are neither interior nor end-points of contiguous
open intervals; they are therefore points of the second type.

We can now state that $\eta_1 = \eta_2$, for if this were not the case we
could find a contiguous open interval separating $\eta_1$ and $\eta_2$, but
for this reason belonging neither to $D_1$ nor to $D_2$, which is impossible.
We denote the common value of $\eta_1$ and $\eta_2$ by $\eta$ and map $\eta$ onto the
selected dyadic irrational $\xi$. The principle of the correspondence
$\xi \to \eta$ is thus established. If $\xi' \ne \xi''$, a dyadic rational can be
found between $\xi'$ and $\xi''$; the corresponding contiguous open interval
will separate the corresponding points $\eta'$ and $\eta''$, which are
therefore distinct. Hence the correspondence between the dyadic
irrationals on [0, 1] and some subset of F is one-one, i.e. F contains
a subset F*, equivalent to this set of dyadic irrationals, and therefore
has the power of the continuum, which completes the proof.

It is clear from the proof that the set F* contains only points
of the second type. We claim that every point of the second type is
the image of some dyadic irrational $\xi$. For the point $\eta$ determines
a division of the aggregate of all open intervals contiguous to F
into two classes: a left $D_1$ (intervals to the left of $\eta$) and a right $D_2$
(intervals to the right of $\eta$). At the same time the set of dyadic
rationals is divided into classes $K_1$ and $K_2$. The number $\xi = \sup K_1 = \inf K_2$
is easily seen to be a dyadic irrational and to have as its
image precisely the point $\eta$.

The transition from the closed interval [0, 1] to the set F can 
now be represented as a continuous deformation in which every 
dyadic rational point is extended to a whole open interval and 
every dyadic irrational point remains a point, the order of corre- 
sponding intervals and points being preserved. 

Finally, a closed set in a metric space is said to be nowhere dense
if it does not contain a single sphere. An arbitrary set A in a metric
space is said to be nowhere dense if its closure $\bar{A}$ is nowhere dense.


Problems. 1. Show that every closed set (on the line or in a space with a 
countable base) is the sum of a perfect set and an at most countable set. 

Hint. The perfect component is the set of accumulation points of the given 
closed set. Cf. problems 5, 6 of Section 3. 

2. Prove that a complete metric space cannot be represented as the count- 
able sum of its nowhere dense subsets. 

Hint. Use the method of proof of theorem 1.


5. THEOREM OF THE FIXED POINT


1. Let us suppose that we have a function, defined on a metric 
space M, that maps each point y of the space onto a point z = A (y) 
of the same space. A is then said to be a reflection of the space M 
into itself. 

In analysis it is often necessary to deal with various reflections
of functional spaces. For instance, if f(x, y) is a given continuous
function of its arguments in the region $a \le x \le b$, $-\infty < y < \infty$,
it can be used to construct a reflection of the space C(a, b) into
itself by means of such formulae as

$$A[y(x)] = f(x, y(x)),$$

$$A[y(x)] = y_0 + \int_{x_0}^{x} f(\xi, y(\xi)) \, d\xi \quad \text{(where } x_0, y_0 \text{ are fixed).}$$

Every point y that is mapped onto itself by a reflection A (i.e.
for which A y = y) is said to be a fixed point of the reflection A.
Many problems in analysis concerning the existence of solutions
to different equations reduce to the question as to whether certain
reflections have fixed points. For example, a theorem on the existence
of a solution to the differential equation

$$\frac{dy}{dx} = f(x, y) \tag{1}$$

with initial conditions $x = x_0$, $y = y_0$, essentially concerns the
existence of a fixed point in the reflection

$$A y(x) = y_0 + \int_{x_0}^{x} f(\xi, y(\xi)) \, d\xi,$$

since equation (1) with the given initial conditions is equivalent
to the equation

$$y(x) = y_0 + \int_{x_0}^{x} f(\xi, y(\xi)) \, d\xi.$$


There are many general theorems that establish the existence of 
a fixed point in a reflection A on the strength of particular assump- 
tions about A. We give here one of the simplest theorems of this 
kind, which gives not only the existence but also the uniqueness 
of a fixed point, admittedly with quite strong limitations imposed 
on the reflection under consideration. 

Definition. A reflection A of a metric space M into itself is said
to be compressive if for any two points $y, z \in M$, the inequality

$$\rho(Ay, Az) \le \theta \rho(y, z)$$

holds, where $\theta$ is a fixed positive number smaller than 1.


THEOREM. A compressive reflection A of a complete metric space M 
into itself has a fixed point, which is, moreover, unique. 

Proof. Proceeding from an arbitrary point $y_0 \in M$, we construct
a sequence of points

$$y_1 = A y_0, \quad y_2 = A y_1 = A^2 y_0, \quad \ldots, \quad y_\nu = A y_{\nu-1} = A^\nu y_0, \quad \ldots$$

We contend that this sequence is fundamental in M, since for any $\nu$

$$\rho(y_\nu, y_{\nu+1}) = \rho(A^\nu y_0, A^{\nu+1} y_0) \le \theta \rho(A^{\nu-1} y_0, A^\nu y_0) \le \ldots \le \theta^\nu \rho(y_0, y_1)$$

and hence

$$\begin{aligned}
\rho(y_\nu, y_{\nu+\mu}) &\le \rho(y_\nu, y_{\nu+1}) + \rho(y_{\nu+1}, y_{\nu+2}) + \ldots + \rho(y_{\nu+\mu-1}, y_{\nu+\mu}) \\
&\le \theta^\nu \rho(y_0, y_1) + \theta^{\nu+1} \rho(y_0, y_1) + \ldots + \theta^{\nu+\mu-1} \rho(y_0, y_1) \\
&\le (\theta^\nu + \theta^{\nu+1} + \ldots + \theta^{\nu+\mu-1} + \ldots) \rho(y_0, y_1) = \frac{\theta^\nu}{1 - \theta} \rho(y_0, y_1);
\end{aligned} \tag{2}$$

by choosing $\nu$ sufficiently large, we can make this quantity as
small as we please. Since M is complete, there exists a limit

$$y = \lim_{\nu \to \infty} y_\nu.$$

We show that y is a fixed point. We have

$$\rho(Ay, y_\nu) = \rho(Ay, A y_{\nu-1}) \le \theta \rho(y, y_{\nu-1}) \to 0,$$

whence it follows that the sequence $y_\nu$ converges to A y. Since the
limit is unique, A y = y, as required. It remains to show that the
fixed point obtained is the only fixed point of the transformation A.
Let us assume that z is a second fixed point, so that in addition
to the equation A y = y, we have A z = z. Then

$$\rho(y, z) = \rho(Ay, Az) \le \theta \rho(y, z).$$

If $\rho(y, z) > 0$, we can cancel by $\rho(y, z)$ and we obtain the contradiction
$1 \le \theta$. Hence we must have $\rho(y, z) = 0$, y = z, i.e. a
second fixed point, distinct from y, cannot exist. The theorem is
proved.

Note. It is useful to estimate the distance from some point $y_\nu$
to the fixed point y. To do this, we proceed to the limit as $\mu \to \infty$
in inequality (2):

$$\rho(y_\nu, y_{\nu+\mu}) \le \frac{\theta^\nu}{1 - \theta} \rho(y_0, y_1); \tag{2}$$

using the continuity of the distance-function (Section 3), we get:

$$\rho(y_\nu, y) \le \frac{\theta^\nu}{1 - \theta} \rho(y_0, y_1). \tag{3}$$

This is the estimate that interests us. In actual problems it
permits us to assess in advance the number of steps necessary to
calculate y with a given accuracy.

Setting $\nu = 0$ in (3), we get

$$\rho(y_0, y) \le \frac{1}{1 - \theta} \rho(y_0, y_1);$$

this inequality provides us with a bound on the distance between
the original point $y_0$ and the fixed point.
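The way in which estimate (3) fixes the number of steps in advance can be sketched in a few lines. The example below is an illustration under stated assumptions: the function name `iterate_to_tolerance` is hypothetical, the space is the closed interval [0, 1] of the real line with the usual metric, and the reflection is the cosine, which maps [0, 1] into itself and is compressive there with $\theta = \sin 1$ by the mean value theorem.

```python
import math


def iterate_to_tolerance(A, y0, theta, tol, dist=lambda u, v: abs(u - v)):
    """Iterate y_nu = A(y_{nu-1}) until the a priori bound (3) is at most tol.

    Returns the approximation y_nu and the number nu of steps taken.
    """
    d01 = dist(y0, A(y0))                 # rho(y_0, y_1)
    y, nu = y0, 0
    while theta ** nu / (1.0 - theta) * d01 > tol:
        y = A(y)
        nu += 1
    return y, nu


# Illustrative choice: A = cos on [0, 1], where |sin t| <= sin 1 < 1.
approx, steps = iterate_to_tolerance(math.cos, 1.0, math.sin(1.0), 1e-8)
```

The loop never inspects the iterates themselves; it stops as soon as the right-hand side of (3) drops below the tolerance, so the guarantee on `approx` rests only on $\theta$ and on the first step $\rho(y_0, y_1)$. In practice the actual error is usually much smaller than this bound.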


2. We shall now demonstrate the application of this theorem to 
the problem of the existence and uniqueness of a solution to the 
differential equation 


$$\frac{dy}{dx} = f(x, y) \qquad (4)$$

with initial conditions $x = x_0$, $y = y_0$. As we remarked above, this
problem concerns the existence and uniqueness of a fixed point in
the reflection

$$Ay(x) = y_0 + \int_{x_0}^{x} f(\xi, y(\xi))\,d\xi. \qquad (5)$$

The function f(x, y) is assumed to be continuous in the region
$a \le x \le b$, $-\infty < y < \infty$. The point $x_0$ is an interior point of the
closed interval [a, b].

The reflection (5) is defined in the metric space C(a, b) of all
continuous functions on the closed interval [a, b]. This space is,
as we have seen, complete; we have only to ascertain under what 
conditions the operator A is compressive. 

To this end, we estimate the distance in the space C(a, b) between
the elements that result from operating with A on the
elements y(x), z(x):

$$\rho(Ay, Az) = \max_{a \le x \le b} |Ay(x) - Az(x)| = \max_{a \le x \le b} \left| \int_{x_0}^{x} [f(\xi, y(\xi)) - f(\xi, z(\xi))]\,d\xi \right|.$$

We suppose that the function f(x, y) satisfies the inequality

$$|f(x, y_1) - f(x, y_2)| \le K|y_1 - y_2| \qquad (6)$$

for any $x \in [a, b]$ and for any values of $y_1$, $y_2$, with K a fixed constant.
The inequality (6) is termed the Lipschitz condition. If the
Lipschitz condition is satisfied, we then have for any $\xi \in [a, b]$

$$|f(\xi, y(\xi)) - f(\xi, z(\xi))| \le K|y(\xi) - z(\xi)| \le K\rho(y, z)$$

and hence

$$\rho(Ay, Az) \le \max_{a \le x \le b} \left| \int_{x_0}^{x} K\rho(y, z)\,d\xi \right| \le K(b - a)\,\rho(y, z).$$


We see that the operator A is compressive if the interval [a, b]
containing the point $x_0$ is sufficiently small, so that

$$K(b - a) = \theta < 1.$$

In this case we can apply the theorem of the fixed point, with
the result: if the function f(x, y) satisfies the Lipschitz condition (6),
then equation (4) with the initial condition $y(x_0) = y_0$ has a solution
y = y(x) in some neighbourhood of the point $x_0$, which is moreover
unique. Thus we have proved a fundamental existence and uniqueness
theorem for the solution of the differential equation (4).
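The successive approximations $y, Ay, A^2y, \dots$ for the reflection (5) can be followed on a grid. The sketch below is only an illustration under stated assumptions: the function name `picard`, the uniform grid, the trapezoidal rule for the integral and the test equation dy/dx = y, y(0) = 1 on [-0.25, 0.25] (so that K = 1 and θ = K(b - a) = 0.5) are our own choices, not part of the text.

```python
import math

def picard(f, x0, y0, a, b, n_grid=201, n_iter=30):
    """Iterate A y(x) = y0 + integral from x0 to x of f(t, y(t)) dt on a grid."""
    h = (b - a) / (n_grid - 1)
    xs = [a + i * h for i in range(n_grid)]
    i0 = round((x0 - a) / h)            # index of x0 (assumed to be a grid node)
    ys = [y0] * n_grid                  # starting element: the constant function y0
    for _ in range(n_iter):
        g = [f(x, y) for x, y in zip(xs, ys)]
        cum = [0.0]                     # trapezoidal integral from a to xs[i]
        for i in range(1, n_grid):
            cum.append(cum[-1] + 0.5 * h * (g[i - 1] + g[i]))
        # integral from x0 to xs[i] = cum[i] - cum[i0]
        ys = [y0 + cum[i] - cum[i0] for i in range(n_grid)]
    return xs, ys

# Assumed test problem: dy/dx = y, y(0) = 1, exact solution exp(x).
xs, ys = picard(lambda x, y: y, x0=0.0, y0=1.0, a=-0.25, b=0.25)
err = max(abs(y - math.exp(x)) for x, y in zip(xs, ys))
```

Here `err` measures the distance in C(a, b) between the computed iterate and the exact solution; it is limited by the quadrature error of the grid rather than by the number of iterations.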

By a similar procedure it is also possible to prove the continuous
dependence of the solution on the initial conditions. We agree first
to call the reflections A, B $\varepsilon$-proximate if, for a given $\varepsilon > 0$, the
inequality $\rho(Ax, Bx) < \varepsilon$ holds for any x in the space M.


LEMMA. In a complete metric space M let two compressive reflections,
A, B, be given, so that

$$\rho(Ax, Ay) \le \theta_A\,\rho(x, y), \qquad \rho(Bx, By) \le \theta_B\,\rho(x, y),$$

and let $\theta = \max(\theta_A, \theta_B) < 1$. Then if A and B are $\varepsilon$-proximate
their fixed points will be situated at a distance not exceeding $\varepsilon/(1 - \theta)$
apart.

Proof. Let $y_0$ be the fixed point of the reflection A. In accordance
with the general theory, the fixed point of B can be obtained as the
limit $y_\infty$ of the sequence $y_0$, $y_1 = By_0$, $y_2 = By_1$, ..., where, by
what we have proved,

$$\rho(y_0, y_\infty) \le \frac{1}{1-\theta}\,\rho(y_0, y_1).$$

But since A and B are $\varepsilon$-proximate, $\rho(y_0, y_1) = \rho(Ay_0, By_0) < \varepsilon$,
whence $\rho(y_0, y_\infty) < \varepsilon/(1 - \theta)$ as required.
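The lemma can be checked on a simple pair of reflections of the real line. In the following Python sketch the two maps, the value θ = 0.5 and the value ε = 0.06 are illustrative assumptions chosen so that the hypotheses hold; the helper `iterate` is a hypothetical name.

```python
import math

def iterate(T, x, n=200):
    """Apply the map T to x n times (approximates the fixed point of T)."""
    for _ in range(n):
        x = T(x)
    return x

# Both maps have derivative bounded by 0.5 in absolute value, so theta = 0.5.
A = lambda x: 0.5 * math.cos(x)
B = lambda x: 0.5 * math.cos(x) + 0.05

theta, eps = 0.5, 0.06        # |A x - B x| = 0.05 < eps for every x
ya, yb = iterate(A, 0.0), iterate(B, 0.0)
gap, bound = abs(ya - yb), eps / (1 - theta)
```

The computed distance `gap` between the two fixed points stays below `bound`, as the lemma asserts.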

The theorem on the continuous dependence of the solution of 
a differential equation on its initial conditions can be formulated 
as follows. 


THEOREM. If equation (1) is considered on the closed interval [a, b]
of length less than 1/K, where K is the constant of the Lipschitz condition
for f(x, y), then for any $\varepsilon > 0$, the inequality $|y_0 - y_1| < \varepsilon$
implies the inequality

$$\max_{a \le x \le b} |y_0(x) - y_1(x)| \le \frac{\varepsilon}{1 - K(b - a)};$$

here $y_j(x)$ denotes the solution that satisfies the initial condition
$y_j(x_0) = y_j$ (j = 0, 1).

Proof. We consider the reflections

$$Ay(x) = y_0 + \int_{x_0}^{x} f(\xi, y(\xi))\,d\xi, \qquad By(x) = y_1 + \int_{x_0}^{x} f(\xi, y(\xi))\,d\xi.$$

They are both compressive, with the same parameter $\theta = K(b - a)$.
If $|y_0 - y_1| < \varepsilon$ the reflections A, B are obviously $\varepsilon$-proximate.
By the lemma, the distance between their fixed points does not
exceed $\varepsilon/(1 - \theta) = \varepsilon/[1 - K(b - a)]$, as required.


Problems. 1. Formulate and prove by the fixed point method an existence
and uniqueness theorem for the solution of the system of differential equations

$$\frac{dy_j}{dx} = f_j(x, y_1, \dots, y_n), \qquad j = 1, 2, \dots, n.$$

Hint. The metric space M is composed of "vector functions"
$y = (y_1(x), \dots, y_n(x))$ with the metric

$$\rho(y, z) = \max_{a \le x \le b} \{|y_1(x) - z_1(x)|, \dots, |y_n(x) - z_n(x)|\}.$$


2. The reflection A on the half-line $1 \le x < \infty$ maps each point x onto
x + 1/x. Is the reflection compressive? Does it have a fixed point?

Answer. Although the reflection A diminishes distance, so that
$\rho(Ax, Ay) < \rho(x, y)$, there is no value $\theta < 1$ that satisfies the inequality
$\rho(Ax, Ay) \le \theta\,\rho(x, y)$ (for all x, y). The reflection is not compressive, and it has no
fixed point.
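The answer to problem 2 can be illustrated numerically. The Python sketch below is our own illustration (the names `A` and `ratio` and the sample points are assumptions): the ratio of distances is always below 1 but comes arbitrarily close to it, and the iterates of A increase without bound instead of settling at a fixed point.

```python
def A(x):
    return x + 1.0 / x

def ratio(x, y):
    """rho(Ax, Ay) / rho(x, y) for the usual metric on the half-line."""
    return abs(A(x) - A(y)) / abs(x - y)

# Ratios for the pairs (10, 11), (100, 101), (1000, 1001), (10000, 10001).
ratios = [ratio(10.0 ** k, 10.0 ** k + 1.0) for k in range(1, 5)]

# Iterating A from x = 1: since A(x)**2 > x**2 + 2, the iterates grow without bound.
x = 1.0
for _ in range(1000):
    x = A(x)
```

No single θ < 1 bounds all the values in `ratios`, which is exactly why the fixed point theorem does not apply here.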


6. COMPLETION OF A METRIC SPACE


In Cantor’s theory of real numbers, irrationals are defined with
the help of fundamental sequences†. With the usual metric the
rationals form an incomplete metric space M, and according to Cantor,
the process of constructing the real numbers can be regarded as
the process of constructing a complete metric space $\bar{M}$ that includes
the space M. Cantor’s method, properly generalized, allows
the inclusion of any incomplete metric space M in some complete
metric space $\bar{M}$.

† Cf. for example, V. Nyemitski, M. Sludskaya, A. Cherkassov, Course of
Mathematical Analysis, vol. 1, M-L., 1957.


THEOREM (F. Hausdorff, 1914). Let M be a metric space (in
general incomplete). There exists a complete metric space $\bar{M}$, called
the completion of the space M, which possesses the following properties:

(a) M is isometric with some subspace $M_1 \subset \bar{M}$,
(b) $M_1$ is dense in $\bar{M}$.

Any pair of spaces $\bar{M}$, $\bar{\bar{M}}$ that satisfy conditions (a) and (b) are
isometric with each other.
Proof. We shall call two fundamental sequences $\{y_\nu\}$, $\{z_\nu\}$ in
the space M equivalent if $\lim_{\nu\to\infty} \rho(y_\nu, z_\nu) = 0$. For example, every
pair of sequences in M that converge to the same limit are equivalent,
but no pair that converge to different limits are equivalent. Two
fundamental sequences that are equivalent with a third are also
equivalent with each other. Hence all fundamental sequences that
can be constructed in M can be divided into classes, such that all
the sequences belonging to a given class are equivalent with one
another, and no sequence that is not a member of that class is
equivalent with any sequence belonging to it. From such classes,
which we shall denote by Y, Z, ..., we can construct the new space
$\bar{M}$. All that requires definition is the distance between classes Y, Z.
We define it by the formula

$$\rho(Y, Z) = \lim_{\nu\to\infty} \rho(y_\nu, z_\nu), \qquad (1)$$

where $\{y_\nu\}$ is any fundamental sequence in the class Y, and $\{z_\nu\}$
any one in the class Z. Of course it is necessary in the first place
to make sure that the given limit exists and is independent of the
choice of the sequences $\{y_\nu\}$, $\{z_\nu\}$ in the classes Y, Z. By the quadrangle
inequality (Section 1, art. 2)

$$|\rho(y_\nu, z_\nu) - \rho(y_{\nu+\mu}, z_{\nu+\mu})| \le \rho(y_\nu, y_{\nu+\mu}) + \rho(z_\nu, z_{\nu+\mu}),$$

hence the numbers $\rho(y_\nu, z_\nu)$ form a Cauchy sequence. The limit
$\lim_{\nu\to\infty} \rho(y_\nu, z_\nu)$ therefore exists. If $\{y'_\nu\}$, $\{z'_\nu\}$ are other fundamental
sequences in the classes Y, Z, we find, on applying the quadrangle
inequality again, that

$$|\rho(y_\nu, z_\nu) - \rho(y'_\nu, z'_\nu)| \le \rho(y_\nu, y'_\nu) + \rho(z_\nu, z'_\nu) \to 0,$$

so that the sequence $\rho(y'_\nu, z'_\nu)$ has the same limit as the sequence
$\rho(y_\nu, z_\nu)$. Thus the definition of distance between classes does not
depend on the choice of fundamental sequences in these classes.
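Cantor's original case, the rationals, gives a concrete picture of equivalent fundamental sequences and of the distance (1). The Python sketch below is an illustration only: the two sequences (Newton iterates for x² = 2 and truncated decimal expansions of √2) and the helper names `newton` and `decimal` are our own assumptions. Both sequences consist of exact rationals, neither has a rational limit, and the distances between corresponding terms tend to zero, so they belong to the same class.

```python
from fractions import Fraction
from math import isqrt

def newton(k):
    """k-th Newton iterate for x*x = 2, started at 1; always a rational number."""
    x = Fraction(1)
    for _ in range(k):
        x = (x + 2 / x) / 2
    return x

def decimal(k):
    """The square root of 2 truncated to k decimal places, as an exact rational."""
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)

ys = [newton(k) for k in range(8)]
zs = [decimal(k) for k in range(8)]

# rho(y_k, z_k): tends to 0, so the two sequences are equivalent.
gaps = [abs(y - z) for y, z in zip(ys, zs)]

# rho(y_k, 1): the distance (1) between the class of {y_k} and the class of 1, 1, 1, ...
to_one = [float(abs(y - 1)) for y in ys]
```

The last list approaches √2 - 1, the distance in the completion between the new element and the rational point 1.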


We must now verify that the quantity

$$\rho(Y, Z) = \lim_{\nu\to\infty} \rho(y_\nu, z_\nu)$$

satisfies axioms 1-3, Section 1.

Axiom 1: $\rho(Y, Z) = \rho(Z, Y)$ holds by construction.

Axiom 2: $\rho(Y, Z) > 0$ for $Y \ne Z$, $\rho(Y, Y) = 0$.

First of all we have $\rho(Y, Y) = 0$ by our construction of the function
$\rho$, since we can put $y_\nu = z_\nu$ in formula (1).

Suppose now that $\rho(Y, Z) = 0$. This means that for arbitrary
fundamental sequences $\{y_\nu\}$, $\{z_\nu\}$ in the classes Y, Z respectively,
$\lim \rho(y_\nu, z_\nu) = 0$. But then the sequences $\{y_\nu\}$, $\{z_\nu\}$ are equivalent
and the classes Y, Z must coincide. Hence, if $\rho(Y, Z) = 0$, then Y = Z;
it follows that for $Y \ne Z$, we must have $\rho(Y, Z) > 0$, as required.

Axiom 3: $\rho(Y, U) \le \rho(Y, Z) + \rho(Z, U)$. Let $\{y_\nu\}$, $\{z_\nu\}$, $\{u_\nu\}$ be
fixed fundamental sequences in the classes Y, Z, U, respectively.
The required inequality is obtained by passing to the limit in

$$\rho(y_\nu, u_\nu) \le \rho(y_\nu, z_\nu) + \rho(z_\nu, u_\nu).$$

We now show that all the assertions formulated above in the
theorem on completion hold for the space $\bar{M}$.

(1) $\bar{M}$ contains a subset $M_1$ isometric with the space M. We map
each element $y \in M$ onto the corresponding class $Y \in \bar{M}$ that
contains the sequence y, y, y, ... (i.e. the class of all sequences
convergent to y). If under this mapping the point y corresponds to the
class Y and the point z to the class Z, then

$$\rho(Y, Z) = \lim \rho(y, z) = \rho(y, z).$$

It follows that the aggregate $M_1$ of these classes Y is a subset of the
space $\bar{M}$ isometric with M.

(2) $M_1$ is dense in $\bar{M}$. Let Y be an arbitrary class in $\bar{M}$ and let
$\{y_\nu\}$ be a fundamental sequence in Y. We consider the sequence of
classes $Y_1, Y_2, \dots, Y_\mu, \dots$, where $Y_\mu$ is determined by the
sequence $(y_\mu, y_\mu, y_\mu, \dots)$, i.e. it is the image of the element $y_\mu$ in
the map $M \to M_1$.

For a given $\varepsilon > 0$, we can find a number $\mu_0$ such that, for $\mu > \mu_0$,
we have $\rho(y_\mu, y_{\mu+p}) < \varepsilon$ for every p > 0. We shall then have

$$\rho(Y, Y_\mu) = \lim_{p\to\infty} \rho(y_{\mu+p}, y_\mu) \le \varepsilon.$$

But this signifies that the class Y is the limit of the classes $Y_\mu$.


Since $Y_\mu$ belongs by construction to the set $M_1$, it follows that $M_1$
is dense in $\bar{M}$.

(3) $\bar{M}$ is a complete space. Let $Y_1, Y_2, \dots$ be a fundamental
sequence of elements of $\bar{M}$. For each class $Y_\nu$ we find a class
$Z_\nu \in M_1$ such that $\rho(Y_\nu, Z_\nu) < 1/\nu$, and let $z_\nu \in M$ be the element
corresponding to $Z_\nu$. We claim that the sequence $\{z_\nu\}$ is fundamental
in the space M. For

$$\rho(z_\nu, z_\mu) = \rho(Z_\nu, Z_\mu) \le \rho(Z_\nu, Y_\nu) + \rho(Y_\nu, Y_\mu) + \rho(Y_\mu, Z_\mu)$$
$$\le \rho(Y_\nu, Y_\mu) + \frac{1}{\nu} + \frac{1}{\mu} \to 0 \qquad (\nu, \mu \to \infty).$$

The fundamental sequence $\{z_\nu\}$ determines a certain class $Z \in \bar{M}$;
we shall show that the class Z is the limit in $\bar{M}$ of the sequence $Y_\nu$.
For a given $\varepsilon > 0$, we have for sufficiently large $\nu \ge \nu_0$:

$$\rho(Z, Y_\nu) \le \rho(Z, Z_\nu) + \rho(Z_\nu, Y_\nu) \le \lim_{\mu\to\infty} \rho(z_\mu, z_\nu) + \frac{1}{\nu} < \varepsilon.$$

Hence every fundamental sequence $\{Y_\nu\} \subset \bar{M}$ has a limit in $\bar{M}$, as
required.

(4) Any metric space $\bar{\bar{M}}$ that has the properties (1)-(3) is isometric
with the space $\bar{M}$.

For let $M_1$, $M_2$ be subsets of the spaces $\bar{M}$, $\bar{\bar{M}}$ respectively, that
are isometric with the space M and consequently isometric with
each other. We must extend this isometry from $M_1$ and $M_2$ to the
spaces $\bar{M}$, $\bar{\bar{M}}$. Let us take any element $Y \in \bar{M}$ and consider a
sequence of elements $Y_\nu \in M_1$ that converges to Y. The corresponding
sequence $Z_\nu \in M_2$ is always fundamental, since in virtue
of the isometry between $M_1$ and $M_2$ the distances between elements
of the sequence $Z_\nu$ are the same as those between the corresponding
elements of the sequence $Y_\nu$. Since $\bar{\bar{M}}$ is complete, it contains an
element $Z = \lim_{\nu\to\infty} Z_\nu$. We map this element onto the chosen element
$Y \in \bar{M}$. It is uniquely determined since equivalent sequences in $M_1$
correspond to equivalent sequences in $M_2$, and the replacement of
a sequence $Y_\nu$ by an equivalent one produces, under the mapping,
the replacement of the sequence $Z_\nu$ by an equivalent one. The
correspondence indicated is one-one and exhausts the elements of $\bar{M}$
and $\bar{\bar{M}}$. It remains for us to show that it is isometric. Let the
elements $Y, Y' \in \bar{M}$ correspond to elements $Z, Z' \in \bar{\bar{M}}$, with

$$Y = \lim Y_\nu, \qquad Y' = \lim Y'_\nu \qquad (Y_\nu, Y'_\nu \in M_1).$$

If $Z_\nu, Z'_\nu \in M_2$ are the images of $Y_\nu$, $Y'_\nu$, then $\rho(Z_\nu, Z'_\nu) = \rho(Y_\nu, Y'_\nu)$
and in virtue of the lemma on the continuity of the distance-function
(Section 3, art. 1)

$$\rho(Z, Z') = \lim_{\nu\to\infty} \rho(Z_\nu, Z'_\nu) = \lim_{\nu\to\infty} \rho(Y_\nu, Y'_\nu) = \rho(Y, Y'),$$

as required. Our proof is now complete.

Note 1. Let us suppose that a given metric space M is a subset
of another complete metric space M*. Then we can take as the
completion of M the closure $\bar{M}$ of the set M in the space M*. For
$\bar{M}$, being a closed subset of the complete space M*, is itself a
complete space, and it contains M as a dense subset. It thus
fulfils the conditions of the theorem just proved, and can therefore
serve as the completion of the space M.

Note 2. The space $C_p(a, b)$ of continuous functions on the closed
interval [a, b] with distance given by

$$\rho^p(y, z) = \int_a^b |y(x) - z(x)|^p\,dx$$

is incomplete (we saw this in Section 4, art. 1). By the theorem
just proved it has a completion $\bar{C}_p(a, b)$. The question naturally
arises: is it possible to attach some concrete significance to the
elements of the space $\bar{C}_p(a, b)$, which are abstractly determined
by theorem 1; can they be interpreted in the form of some functions
or other? It turns out that this can be done, though not very
easily; we shall postpone the consideration of this question until
Chapter IV, when we shall have the necessary tools to resolve it.


7. CONTINUOUS FUNCTIONS AND COMPACT SPACES
1. Definitions and Elementary Properties 


A function f(x), defined on a metric space M and taking numerical
values, is said to be continuous at the point $x_0$ if, for any $\varepsilon > 0$,
there exists $\delta > 0$ such that the condition

$$\rho(x, x_0) < \delta$$

implies

$$|f(x) - f(x_0)| < \varepsilon.$$

As in classical analysis, a second definition equivalent to the first
is possible: the function f(x) is continuous at the point $x_0$ if for
any sequence $x_\nu \to x_0$ we have $f(x_\nu) \to f(x_0)$. A function which is
continuous at each point of the set M is said to be continuous on M.

One of the simplest examples of a continuous function in a
metric space is the distance from a point x to a fixed point $x_0$.
The continuity of this function follows from the second triangle
inequality (Section 1, art. 2):

$$|\rho(x', x_0) - \rho(x'', x_0)| \le \rho(x', x'').$$


The usual properties of functions continuous on the line, which are 
familiar in analysis, are easily carried over to the case of functions 
continuous on metric spaces. Thus the sum, difference, and pro- 
duct of two continuous functions are also continuous functions. 
The quotient of two continuous functions is also continuous at all 
points where the denominator is non-zero. 

We shall establish the following important properties of conti- 
nuous functions on a metric space: 


LEMMA. If f(x) is a continuous function, then the sets

$$F_1 = \{x : f(x) \le A\}, \qquad F_2 = \{x : f(x) \ge A\}$$

are closed for any A, and the sets

$$U_1 = \{x : f(x) < A\}, \qquad U_2 = \{x : f(x) > A\}$$

are open for any A.

Proof. We shall show that the set $U_1$ is open. Let $x_0 \in U_1$, so
that $f(x_0) < A$, and let us put $\varepsilon = A - f(x_0)$. Since the function f(x)
is continuous at the point $x_0$, there exists a sphere $\rho(x, x_0) < \delta$
in which the inequality $|f(x) - f(x_0)| < \varepsilon$ holds. Within this
sphere

$$f(x) < f(x_0) + \varepsilon = A,$$

and hence the whole sphere belongs to the set $U_1$. Since $x_0$ is an
arbitrary point of $U_1$, it follows that $U_1$ is an open set. Its
complement is the set $F_2$, which is therefore closed. The proof proceeds
similarly for the sets $U_2$ and $F_1$; or with the substitution of
2A - f(x) for f(x), the problem reduces to the foregoing one.

Conversely, if it is known that for some function f(x) defined on
a metric space M, each of the sets

$$U_1 = \{x : f(x) < A\}, \qquad U_2 = \{x : f(x) > A\}$$

is open for any A, then the function f(x) is continuous.
For in this case, for any point $x_0 \in M$ and any $\varepsilon > 0$, we can form
the sets

$$U_1 = \{x : f(x) < f(x_0) + \varepsilon\}, \qquad U_2 = \{x : f(x) > f(x_0) - \varepsilon\},$$

which are stipulated to be open. The intersection of these sets

$$U = \{x : f(x_0) - \varepsilon < f(x) < f(x_0) + \varepsilon\} = \{x : -\varepsilon < f(x) - f(x_0) < \varepsilon\}$$

is also open. It contains the point $x_0$ and with it some
sphere $U(x_0) = \{x : \rho(x, x_0) < \delta\}$. Within this sphere the inequality

$$|f(x) - f(x_0)| < \varepsilon$$

holds, but this also denotes the continuity of the function f(x)
at the point $x_0$.

A corresponding proposition can be formulated for the sets
$F_1 = \{x : f(x) \le A\}$, $F_2 = \{x : f(x) \ge A\}$: if, for a function f(x),
each of these sets is closed for any A, then f(x) is continuous. The proof
can be obtained by taking complements in the above.

Functions that are defined on a metric space which is itself
composed of functions, such as the spaces C(a, b), $D_1(a, b)$, etc.,
are generally termed functionals.


Problems. 1. Are the following functionals on the space C(a, b) continuous?
(a) F(y) = y(@); 
(b) F(y) = max |y(z)|; 
(c) F(y) = gg y (x); 


(d) $F(y) = \int_a^b y(x)\,dx$;

(e) F(y) = 0, if y(x) assumes a single negative value; F(y) = 1/2, if $y(x) \equiv 0$;
F(y) = 1, if $y(x) \ge 0$ and $y(x) \not\equiv 0$.

Answer. Continuous in cases (a)-(d), but not in (e).


2. Are the following functionals on the space $D_1(a, b)$ continuous?
(a) F(y) = y(a);

(b) $F(y) = \int_a^b \sqrt{1 + y'^2(x)}\,dx$.

Answer. Yes.

3. Is the function $\rho(x, B) = \inf_{y \in B} \rho(x, y)$ (Section 3, problem 8) continuous?

Answer. Yes.

4. Is the function $d(x, B) = \sup_{y \in B} \rho(x, y)$ continuous?

Answer. Yes.


2. Compact Spaces 


The analogy with the properties of continuous functions on a 
closed interval is not always maintained under the transition to a 
general metric space. For example, a continuous function on the 
closed interval [a, b] is always bounded on this interval and attains 
its upper and lower bounds. A continuous function on a sphere of 
radius r in a metric space M may be unbounded or may not attain 
its bounds (an example is given in one of the problems of the pre- 
sent paragraph). For specified properties of continuous functions 
to hold in a given metric space M, the space must suffer the im- 
position of further restrictions. 

A metric space M is said to be locally compact if every infinite
subset $A \subset M$ contains a fundamental sequence.

Thus every infinite subset A of the interval a < x < b has, by
the well-known Bolzano—Weierstrass theorem, a limit point in 
[a, b] and therefore contains a fundamental sequence; we see that 
the interval M = (a, b) is a locally compact metric space. 

A locally compact metric space need not be complete, as we saw 
with the open interval (a, b) in the preceding example. A metric 
space that is both locally compact and complete is said to be 
compact. 

An independent definition of compactness can be given: a 
metric space M is compact if every infinite subset contained in it 
contains a convergent sequence. 

A typical example of a compact space is the closed interval
$a \le x \le b$ on the real line with the usual metric.

The properties of continuous functions in being bounded and 
in attaining their bounds are directly connected with the compact- 
ness of the sets on which they are defined. 


THEOREM 1. Every continuous function f(x), defined on a compact
space M, is bounded.

Proof. Let us suppose that f(x) is unbounded. Then for any
integer n we can choose a point $x_n \in M$ for which $|f(x_n)| > n$.

By hypothesis the sequence of points $x_1, x_2, \dots, x_n, \dots$ must
contain a convergent subsequence: rejecting a subset of these points
if necessary, we can assume that the sequence itself converges
to some point $x_0 \in M$. By the continuity of f(x), there exists a
neighbourhood of $x_0$, defined say by the inequality $\rho(x, x_0) < \delta$,
in which $|f(x) - f(x_0)| < 1$ or $|f(x)| < |f(x_0)| + 1$. On the other
hand this neighbourhood contains points of the sequence $x_1, x_2, \dots$
with subscripts as great as we please; at these points |f(x)| assumes
values as large as we please. The contradiction obtained shows
that f(x) cannot be unbounded, but must in fact be bounded, as
required.


THEOREM 2. Every continuous function f(x) defined on a compact
space M attains on M its exact upper (and lower) bound.

Proof. Let b be the exact upper bound of f(x). For any integer n
we can find a point $x_n$ satisfying the inequality

$$0 \le b - f(x_n) < \frac{1}{n}.$$

Let us suppose that the function f(x) fails to attain the value b
anywhere in M. Then the function

$$\varphi(x) = \frac{1}{b - f(x)}$$

is continuous on M and is bounded (by theorem 1). But as x runs
through the sequence of points $x_n$, the denominator tends to zero.
Hence the function $\varphi(x)$ cannot be bounded, and our assumption
must have been untenable. It follows that f(x) attains the value b
at some point of M.

The converse results are also true: if a metric space M is not
compact, there exist continuous functions defined on M and either
unbounded or, though bounded, not attaining their bounds. One
of the problems below is devoted to this question. Thus the condition
"M is compact" is both necessary and sufficient for the
validity of theorems 1 and 2.


THEOREM 3. Every continuous function f(x), defined on a compact
space M, is uniformly continuous on it; in other words, for any $\varepsilon > 0$,
we can find $\delta > 0$ such that $\rho(x, y) < \delta$ entails $|f(x) - f(y)| < \varepsilon$.

Proof. Supposing the contrary to be true, we should be able for
some $\varepsilon = \varepsilon_0$ to find sequences $x_n$, $y_n$ such that

$$\rho(x_n, y_n) < \frac{1}{n}, \qquad |f(x_n) - f(y_n)| \ge \varepsilon_0.$$

By hypothesis the sequence $x_n$ contains a subsequence convergent
to a point $x_0$; discarding some subset of the points if necessary,
we can suppose that the sequence $x_n$ itself converges to $x_0$.
Then the sequence $y_n$ also converges to $x_0$. From some point on,
$x_n$ and $y_n$ will lie in a neighbourhood of $x_0$ which is such that
$|f(x) - f(x_0)| < \varepsilon_0/2$. But then

$$|f(x_n) - f(y_n)| \le |f(x_n) - f(x_0)| + |f(x_0) - f(y_n)| < \frac{\varepsilon_0}{2} + \frac{\varepsilon_0}{2} = \varepsilon_0,$$

contrary to our construction. The theorem is therefore proved.
3. Conditions for Compactness 


We shall now obtain some conditions that can be conveniently 
applied to establish the compactness of particular metric spaces. 

We can suppose, without loss of generality, that a given metric 
space M is contained in some complete space P, since we can 
always take the completion of M as the space P. 

A set $B \subset P$ is said to be an $\varepsilon$-net for the set $M \subset P$ if each
point x of the set M is within a distance $\varepsilon$ of some point $y \in B$.


THEOREM 1 (F. Hausdorff, 1914). A set M contained in a metric
space P is locally compact (in terms of the metric of P) if and only if,
for any $\varepsilon > 0$, P contains a finite $\varepsilon$-net for M.

Proof. Let M be locally compact and let $\varepsilon > 0$ be given; we
shall show that a finite $\varepsilon$-net for M exists. We take an arbitrary
point $x_1 \in M$. If all the remaining points of M lie at a distance $\le \varepsilon$
from $x_1$, the point $x_1$ itself constitutes an $\varepsilon$-net for M and the
construction is completed. If there are points of M situated at a
distance $> \varepsilon$ from $x_1$, we choose an arbitrary point $x_2$ from among
them. If now each point of M is within a distance $\varepsilon$ of either $x_1$
or $x_2$, the points $x_1$, $x_2$ form a finite $\varepsilon$-net for M and the construction
is complete; otherwise we continue the construction. Each new
point $x_\nu$ that is selected in the process is at a distance exceeding $\varepsilon$
from each of its predecessors $x_1, x_2, \dots, x_{\nu-1}$. Hence if the process
were to continue indefinitely, we should have an infinite subset
$x_1, x_2, \dots, x_\nu, \dots$ of M that did not contain a single fundamental
sequence, in contradiction to the local compactness of M. Since M
is locally compact, the process must terminate after a finite number
of steps; but the ending of the process signifies the construction
of a finite $\varepsilon$-net for M.

Conversely let a finite $\varepsilon$-net for M exist in the space P for every
$\varepsilon > 0$; we shall show that M is locally compact. We consider an
arbitrary infinite subset $A \subset M$, and we have to find a fundamental
sequence in A. As the first point of this sequence, we take any point
$x_0 \in A$. Using the condition of the theorem with $\varepsilon = 1$ we can
cover A with a finite number of spheres of radius 1; amongst these
there must be one, which we denote by $U_1$, that contains an infinite
subset $A_1 \subset A$. From $A_1$ we select any point $x_1 \ne x_0$. Using the
condition of the theorem with $\varepsilon = 1/2$ we can cover $A_1$ with a
finite number of spheres of radius 1/2; amongst these there is a
sphere $U_2$ that contains an infinite subset $A_2 \subset A_1$. We choose
any point $x_2 \in A_2$, distinct from $x_0$ and $x_1$. Continuing in a similar
fashion, we get a chain of infinite subsets $A \supset A_1 \supset \dots \supset A_\nu \supset \dots$
(where each set $A_\nu$ is contained in a sphere $U_\nu$ of radius $1/\nu$) and
a sequence of distinct points $x_0, x_1, x_2, \dots, x_\nu, \dots$, where $x_\nu \in A_\nu$.
We claim that the sequence $x_0, x_1, x_2, \dots$ is fundamental. For
$\mu < \nu$ we have $U_\mu \supset A_\mu \supset A_\nu$, so that $\rho(x_\mu, x_\nu) < 2/\mu$. This
quantity tends to zero as $\mu \to \infty$, and the sequence $x_0, x_1, \dots$ is
consequently fundamental, as asserted.
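The construction in the first half of the proof is effectively an algorithm: keep choosing a point that lies farther than ε from all points already chosen, until none is left. The Python sketch below applies it to a finite sample, where termination is automatic; the function names `greedy_net` and `euclid`, the sample of the unit square and the value ε = 0.25 are illustrative assumptions, not part of the text.

```python
import math

def greedy_net(points, eps, dist):
    """Pick points one by one, each farther than eps from all earlier picks."""
    net = []
    for p in points:
        if all(dist(p, q) > eps for q in net):
            net.append(p)
    return net

def euclid(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Assumed finite sample M of the unit square.
M = [(i / 20, j / 20) for i in range(21) for j in range(21)]
net = greedy_net(M, 0.25, euclid)
```

Every point that was not selected lies within ε of some earlier selection, so `net` is an ε-net for `M`, and its points are mutually more than ε apart, just as in the proof.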

As an application of this criterion we show that any bounded
infinite set M in n-dimensional Euclidean space $P = E_n$ is locally
compact. For a sphere in the space P that contains the bounded
set M contains, for any m, only a finite number of points all the
coordinates of which have the form $k/2^m$, where k is an integer,
and the set of these points evidently constitutes, for sufficiently
large m, an $\varepsilon$-net for M.

We now remark another simple criterion for local compactness:
a set M in a metric space P is locally compact if, for any $\varepsilon > 0$,
a locally compact set $B_\varepsilon$ (possibly infinite) can be found that is an
$\varepsilon$-net for M.

The proof of this criterion is very simple. We assert that for a
given $\varepsilon$ the finite $\varepsilon/2$-net Z for the set $B_{\varepsilon/2}$, which exists in virtue
of the local compactness of $B_{\varepsilon/2}$, is an $\varepsilon$-net for the set M. For
according to the stipulation, given an arbitrary point $x \in M$, there
exists a point $y \in B_{\varepsilon/2}$ such that $\rho(x, y) \le \varepsilon/2$, and a point $z \in Z$
such that $\rho(y, z) \le \varepsilon/2$; but then $\rho(x, z) \le \rho(x, y) + \rho(y, z) \le \varepsilon$,
as asserted. Thus, for any $\varepsilon > 0$, M possesses a finite $\varepsilon$-net, and
is therefore locally compact.


As an application of this criterion, we show that the completion $\bar{M}$
of any locally compact set M is compact. In fact, since the set M
is dense in $\bar{M}$, it constitutes an $\varepsilon$-net for $\bar{M}$ for any $\varepsilon$. But M is
stipulated to be locally compact, and so $\bar{M}$ is locally compact;
since $\bar{M}$ is also complete, it is compact, as required.


THEOREM 2. A locally compact subset M of a complete metric 
space P is compact if and only if it is closed in P. 

Proof. If the subset M is closed in the complete metric space P, 
it is itself a complete metric space (Section 4, art. 2, example 3). 
If it is in addition, locally compact, it is by definition compact. 
Conversely let $M \subset P$ be compact; then M is a locally compact
set and we have only to show that it is closed. This follows from
the same result of Section 4 (art. 2, example 3) in conjunction with
the result that, being compact, M is a complete space.

Combining theorems 1 and 2, we get:


THEOREM 3. A set M in a complete metric space P is compact if
and only if it is closed in P and for every $\varepsilon > 0$, P possesses a finite
$\varepsilon$-net for M.

In particular any closed sphere in n-dimensional Euclidean space 
is compact. Every continuous function, defined on such a sphere, 
is bounded and attains its exact upper and lower bounds. 


Problems. 1. Given a compact space Q, find a countable, everywhere dense 
set of points. 
(Hint. Consider the union of all finite 1/m-nets for Q, m = 1, 2, ...) 


2. Show that from any system of open sets {G} that together cover a
compact space Q it is possible to select a finite subsystem $G_1, \dots, G_m$ that
also covers Q.

Hint. If from a given covering of the compact space Q it is impossible to
extract a finite covering, then it is also impossible to extract a finite covering
of some sphere $Q_n$, one of finitely many spheres of radius $1/2^n$ that together
cover Q. Consider the limit point of the set of centres of the spheres $Q_n$.


3. Show that the descending chain $F_1 \supset F_2 \supset \dots$ of non-empty closed
subsets of a compact space has a non-empty intersection.
Hint. Pass over to complements and use problem 2.


4. If a sequence of continuous functions $f_1(x) \le f_2(x) \le \dots$ converges on a
compact space Q to a continuous function f(x), it converges uniformly on
that space (Dini's theorem).

Hint. For a given $\varepsilon > 0$ and a fixed point $x_0$ find a number $n_0$ such that
$0 \le f(x_0) - f_{n_0}(x_0) < \varepsilon$. There exists a neighbourhood of $x_0$ in which
$0 \le f(x) - f_{n_0}(x) < 3\varepsilon$ and hence also $0 \le f(x) - f_n(x) \le 3\varepsilon$ for all $n \ge n_0$. Then
use the result of problem 2.


5. An aggregate of continuous functions {f(x)} = A on a compact space Q
is said to be uniformly equi-continuous if for any $\varepsilon > 0$ there exists $\delta > 0$
such that the inequality $\rho(x', x'') < \delta$ implies $|f(x') - f(x'')| < \varepsilon$ for any
$f \in A$; it is said to be uniformly bounded if there exists a constant C such
that $|f(x)| \le C$ for any $f \in A$. Show that the aggregate A is a compact
space with respect to the metric $\rho(f, g) = \max_x |f(x) - g(x)|$ if and only
if it is uniformly equi-continuous and uniformly bounded (Arzela's theorem).

Hint. Include A in the space of all bounded (even discontinuous) functions
on Q with the metric $\rho(f, g) = \sup |f(x) - g(x)|$. For a given $\delta$-net on Q the
set of piecewise-constant functions that assume constant values, equal to
integral multiples of $\varepsilon$ not exceeding C in absolute value, on spheres of
radius $\delta$ with centres at points of the net form a finite $3\varepsilon$-net for A. (To
eliminate many-valuedness at points common to several spheres, choose
one possible value at random.)


6. Show that the functional

$$F(y) = \int_0^{1/2} y(x)\,dx - \int_{1/2}^{1} y(x)\,dx$$

is continuous on the space C(0, 1); show that the exact upper bound of its
values on a closed unit sphere in the space is equal to 1, but that this bound
is not attained at any point of the sphere.


7. Given a non-compact metric space M construct on it a continuous 
unbounded function. 

Hint. Each point $x_n$ of a sequence $x_1, x_2, \dots$ from which a convergent
subsequence cannot be extracted is situated at a positive distance $r_n$ from
the set of all remaining points of the sequence. Every function f(x) that is
continuous on each of the spheres $\{\rho(x, x_n) \le \frac{1}{2} r_n\}$ and is equal to zero on their
boundaries and outside the spheres is continuous on M.

8. Given a reflection A of a compact space into itself which satisfies the
condition $\rho(Ax, Ay) < \rho(x, y)$ for $x \ne y$, show that it possesses a unique
fixed point.

Hint. The minimum of the continuous function $\rho(Ax, x)$ cannot be
positive.

9. Show that a compact space cannot be reflected isometrically onto a
part of itself (V. A. Rokhlin).

Hint. Supposing the contrary to be true, it is possible to find a point at a
positive distance $r_0$ from the image of the space. Reiterating the reflection,
construct a sequence in which the mutual distances between points are
never less than $r_0$.


10. Construct a locally compact set on the plane which is isometric with
a part of itself.

Hint. Consider the set of points with polar coordinates $\rho = 1$, $\varphi = 0, 1, 2, \dots$

11. Let A, B be isometric reflections of the compact space Q into itself.
We define the distance between A and B by the formula

$$\rho(A, B) = \max_{x \in Q} \rho(Ax, Bx). \qquad (1)$$

Show that the set of all isometric reflections of Q into itself, metricised by (1),
is compact (B. L. Van der Waerden).

Hint. For a given finite $\varepsilon$-net on Q, observing that there exist only finitely
many reflections of a finite set onto itself, construct a finite $2\varepsilon$-net in the
space of isometric reflections.
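The phenomenon in problem 6 above, a bound that is approached but never attained, can be observed numerically. The Python sketch below is an illustration only and not a solution: the quadrature helper `F` (midpoint rule) and the family of ramp functions `ramp(k)`, which equal +1 on the left, -1 on the right and fall linearly with slope -k near x = 1/2, are our own assumptions. For these functions of norm 1 one finds F = 1 - 1/k.

```python
def F(y, n=20000):
    """Midpoint-rule value of the functional of problem 6 for a function y on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        # integrand is +y on [0, 1/2) and -y on [1/2, 1]
        total += (y(x) if x < 0.5 else -y(x)) * h
    return total

def ramp(k):
    """Continuous function with max |y| = 1: the line k*(1/2 - x) clamped to [-1, 1]."""
    def y(x):
        return max(-1.0, min(1.0, k * (0.5 - x)))
    return y

values = [F(ramp(k)) for k in (10, 100, 1000)]
```

The values increase towards 1 as the ramp steepens, while the limiting step function is discontinuous and so does not belong to C(0, 1).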


4. Functions of Several Variables


We shall sometimes come across continuous functions with 
several arguments that vary over a metric space M. For the sake 
of definiteness let us consider a function of a pair of points, y, z. 
A real-valued function f(y, z) the arguments of which belong to
a metric space M is said to be continuous at $y = y_0$, $z = z_0$ if
for any $\varepsilon > 0$ there exists $\delta > 0$ such that the inequalities

$$\rho(y, y_0) < \delta, \qquad \rho(z, z_0) < \delta \qquad (2)$$

imply

$$|f(y, z) - f(y_0, z_0)| < \varepsilon. \qquad (3)$$

A function continuous over all pairs $y_0$, $z_0$ is said to be continuous
everywhere.

An example of a continuous function of a pair of points y, z is
the distance-function $\rho(y, z)$. For by the quadrangle inequality
(Section 1, art. 2)

$$|\rho(y, z) - \rho(y_0, z_0)| \le \rho(y, y_0) + \rho(z, z_0), \qquad (4)$$

which can be made less than a given $\varepsilon > 0$ by taking $\rho(y, y_0) < \varepsilon/2$,
$\rho(z, z_0) < \varepsilon/2$.

There is no need to reiterate the whole theory, expounded in 
paragraphs 1-2, for functions of several variables. In fact every 
function of several variables which vary over a metric space M 
can be represented as a function of a single variable which varies 
over some new metric space M’. For the sake of simplicity we shall 
consider the case of two variables, but we first define the product 
of metric spaces. 


Let metric spaces M, N be given. We consider the set of all
possible formal pairs {x, y}, where $x \in M$, $y \in N$, and define the
distance between such pairs by the formula

$$\rho(\{x_1, y_1\}, \{x_2, y_2\}) = \rho(x_1, x_2) + \rho(y_1, y_2), \qquad (5)$$

where $\rho(x_1, x_2)$, $\rho(y_1, y_2)$ denote distances in the spaces M, N,
respectively.

It is easily verified that the distance defined by this rule satisfies
the axioms of Section 1. The set of all pairs {x, y} together with
this metric is said to be the product of the metric spaces M, N; it is
denoted by M × N.

Let us consider a function $f(x, y)$, the first argument of which
runs through the space $M$, the second through the space $N$. It can
evidently be regarded as a function of a single argument which
runs through the space $M \times N$. In particular, a function $f(x, y)$,
the arguments of which run through one and the same space $M$,
can be regarded as a function of a single argument which runs
through the space $M \times M$. If $f(x, y)$ is a continuous function of
the arguments $x$, $y$ in the sense indicated above, the corresponding
function on the space $M \times M$ will evidently be continuous in the
usual sense (art. 1). Thus the theory of continuous functions of
two variables reduces to the theory of continuous functions of a
single variable.
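By way of illustration, the product metric can be tried out numerically. The following is a minimal sketch only: the helper names (`product_metric`, `abs_metric`, `triangle_violations`) are hypothetical, and the real line with the usual distance is assumed for both factors. It builds the distance on the product as the sum of the coordinate distances and counts violations of the triangle axiom on a small sample of pairs.

```python
import itertools

def product_metric(rho_m, rho_n):
    """Build the metric on M x N as the sum of the coordinate distances."""
    def rho(p, q):
        (x1, y1), (x2, y2) = p, q
        return rho_m(x1, x2) + rho_n(y1, y2)
    return rho

def abs_metric(a, b):
    # Usual distance on the real line (an illustrative choice of M and N).
    return abs(a - b)

def triangle_violations(rho, points, tol=1e-12):
    """Count triples p, q, r with rho(p, r) > rho(p, q) + rho(q, r)."""
    return sum(
        1
        for p, q, r in itertools.product(points, repeat=3)
        if rho(p, r) > rho(p, q) + rho(q, r) + tol
    )

rho = product_metric(abs_metric, abs_metric)
sample = [(i / 2, j / 2) for i in range(-2, 3) for j in range(-2, 3)]
violations = triangle_violations(rho, sample)
print(violations)
```

A check on finitely many points is of course no proof of the axioms; it merely shows how a function of a pair of points becomes a function of one point of the product.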


Problems. 1. Let the spaces $M$, $N$ be locally compact; show that their
product $M \times N$ is also locally compact.

2. Let the spaces $M$, $N$ be complete; show that their product $M \times N$ is
also complete.

3. Give an example where a space $M$ is locally compact and a space $N$
is complete, but the product $M \times N$ is neither locally compact nor complete.

4. If metric spaces $M$, $N$ are infinite, there exists a function $f(x, y)$,
continuous in each of its arguments separately (with the other argument fixed),
but not continuous on the product space $M \times N$.


8. NORMED LINEAR SPACES
1. Linear Spaces 
In analysis the operation of passing to the limit is most frequently
encountered in combination with other operations, among
which linear operations (the addition of elements and their multiplication
by numbers) are greatly in evidence. These operations
themselves are studied in linear algebra. We recall a fundamental
definition connected with them: the definition of a linear space.
A set $E$ of elements $x, y, \ldots$ is said to be a linear space if there are
defined on it operations of addition and multiplication by (real or
complex) numbers that satisfy the eight axioms listed below.
The first group of axioms (1-4) describes the properties of addi- 
tion: 


1. $x + y = y + x$ (commutativity of addition).
2. $(x + y) + z = x + (y + z)$ (associativity of addition).
3. There exists an element $0$ such that $x + 0 = x$ for any $x \in E$.
4. For each $x \in E$ the equation $x + y = 0$ has a solution. The
element $y$ is said to be the inverse of the element $x$.


It is easily verified that the elements 0 and y, the existences of 
which are required by axioms 3 and 4, are determined uniquely. 

The next group of axioms (5-8) provides a link between the 
operations of addition and multiplication by numbers (scalars). 
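We denote by $C$ the set of (real or complex) numbers by which
multiplication is admissible.

5. $\lambda(\mu x) = (\lambda \mu) x$ ($\lambda \in C$, $\mu \in C$, $x \in E$).
6. $1 \cdot x = x$.
7. $\lambda(x + y) = \lambda x + \lambda y$.
8. $(\lambda + \mu) x = \lambda x + \mu x$.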


It can be shown that $0 \cdot x = 0$ and that the inverse of a given $x$
is the element $y$ obtained by multiplying $x$ by $-1$.

A set $L$ of elements of a linear space $E$ is said to be a subspace
of $E$ if the operations on elements of $L$ of addition and multiplication
by numbers always produce elements of $L$. The smallest
subspace consists of the single element $0$, the largest is the whole
space $E$.

An example of a linear space is the aggregate $R_n$ of arrays of $n$
numbers

    $$x = (\xi_1, \ldots, \xi_n) \qquad (1)$$


† A detailed account can be found, for example, in: G. Ye. Shilov, Introduction
to the theory of linear spaces, State Technical Publishing House,
1956 (2 issues), Chapter II on. English translation: An Introduction to the
Theory of Linear Spaces (Prentice-Hall, London, 1961).




with operations effected "coordinate by coordinate": if $x = (\xi_1, \ldots, \xi_n)$,
$y = (\eta_1, \ldots, \eta_n)$, we define

    $$x + y = (\xi_1 + \eta_1, \ldots, \xi_n + \eta_n); \quad \alpha x = (\alpha \xi_1, \ldots, \alpha \xi_n). \qquad (2)$$

This space is said to be $n$-dimensional, i.e. between any $n + 1$
elements $x^{(1)}, \ldots, x^{(n+1)}$ there exists a dependence relation

    $$\alpha_1 x^{(1)} + \cdots + \alpha_{n+1} x^{(n+1)} = 0, \quad \sum_{j=1}^{n+1} \alpha_j^2 > 0, \qquad (3)$$

and there exists also a set of $n$ elements $e^{(1)}, \ldots, e^{(n)}$ for which no
such dependence relation holds.

The functional spaces $C(a, b)$, $D_m(a, b)$, $C_p(a, b)$ (Section 1) are
also linear spaces (with the natural operations). In contrast to $R_n$,
these spaces are infinite-dimensional.

We recall further the important concept of an isomorphism
between linear spaces. Two linear spaces, $E'$ and $E''$, are said to
be isomorphic if there exists between them a one-one correspondence
that preserves the operations of addition and scalar multiplication;
this means that if a vector $x' \in E'$ corresponds to a
vector $x'' \in E''$, and $y' \in E'$ corresponds to $y'' \in E''$, then the
vector $x' + y' \in E'$ corresponds to the vector $x'' + y'' \in E''$, and
for any $\lambda \in C$ the vector $\lambda x' \in E'$ corresponds to the vector
$\lambda x'' \in E''$.

Thus any two $n$-dimensional spaces are isomorphic; each of
them is isomorphic with the $n$-dimensional coordinate space described
above.


2. Normed Linear Spaces 


The properties of a linear space are combined with the metric 
properties, to which we have so far devoted this chapter, in the 
definition of a normed linear space (a complete normed linear space 
is called a Banach space): 

A set $E$ of elements $x, y, \ldots$ is said to be a normed linear space if:

(1) it is a linear space; 

(2) it is a metric space, and one in which 

(3) the distance function $\rho(x, y)$ and the linear operations are
connected by the following conditions:

(a) distance is invariant under displacement, i.e.

    $$\rho(x + z, y + z) = \rho(x, y) \quad \text{for any } x, y, z;$$

(b) distance is subject to the "homogeneity condition":

    $$\rho(\lambda x, 0) = |\lambda| \rho(x, 0)$$

for any number $\lambda$ and any element $x$.

From axiom (a) it follows that the distance between two points
is equal to the distance of their difference from zero:

    $$\rho(x, y) = \rho(x - y, y - y) = \rho(x - y, 0).$$


It is therefore sufficient to know the distance of any element from
zero. The distance $\rho(x, 0)$ is said to be the norm (or length) of the
element $x$ and is denoted by $\|x\|$ or $|x|$. From the metric axioms
and the properties (3) it is easy to deduce that the norm of any
element satisfies the conditions:

($\alpha$) $\|x\| > 0$ for $x \ne 0$, $\|0\| = 0$;
($\beta$) $\|\lambda x\| = |\lambda| \cdot \|x\|$;
($\gamma$) $\|x + y\| \le \|x\| + \|y\|$.

Conversely, if a norm is defined on some linear space $E$, i.e. each
element $x$ determines a number $\|x\|$ such that conditions ($\alpha$)-($\gamma$)
are satisfied, a metric on $E$ can be introduced by the formula

    $$\rho(x, y) = \|x - y\|$$

and $E$ becomes a normed linear space. In general a linear space
with a norm that satisfies conditions ($\alpha$)-($\gamma$) is also said to be a
normed linear space.

The metric spaces $C(a, b)$, $D_m(a, b)$, $C_p(a, b)$ ($p = 1, 2$) considered
in Section 1 of this chapter are normed linear spaces.

For in the space $C(a, b)$ of continuous functions on the closed
interval $a \le x \le b$, distance was given by the formula

    $$\rho(y, z) = \max_{a \le x \le b} |y(x) - z(x)|.$$

Axiom (3) is evidently satisfied here. The norm of an element $y$ is
defined by the formula

    $$\|y\| = \rho(y, 0) = \max_{a \le x \le b} |y(x)|.$$


In the space $D_m(a, b)$ of functions with continuous derivatives up
to order $m$ on the closed interval $[a, b]$, distance was given by the
formula

    $$\rho(y, z) = \max_{a \le x \le b} \{|y(x) - z(x)|, \ldots, |y^{(m)}(x) - z^{(m)}(x)|\}.$$

It is easily seen that both the properties required in axiom (3)
are satisfied here. The norm of an element $y$ is defined by

    $$\|y\| = \max_{a \le x \le b} \{|y(x)|, |y'(x)|, \ldots, |y^{(m)}(x)|\}.$$

In the space $C_p(a, b)$ ($p = 1, 2$) of continuous functions on $[a, b]$
with distance given by

    $$\rho^p(y, z) = \int_a^b |y(x) - z(x)|^p \, dx,$$

the two requirements of axiom (3) are again satisfied (the second
because we have $\rho^p$ and not $\rho$ on the left). The norm of an element
is defined by

    $$\|y\|^p = \int_a^b |y(x)|^p \, dx.$$
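The three norms just described can be approximated numerically. The sketch below is illustrative only: the function names are hypothetical, the maximum is estimated by sampling on a grid, and the integral by the trapezoidal rule, so the values are approximations and not exact norms. It is applied to $y(x) = \sin x$ on $[0, \pi]$, for which the norm in $C$ is 1, the norm in $D_1$ is 1, the norm in $C_1$ is 2, and the norm in $C_2$ is $\sqrt{\pi/2}$.

```python
import math

def grid(a, b, n=2000):
    """n + 1 equally spaced points of [a, b]."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def norm_C(y, a, b, n=2000):
    """Approximate max |y(x)| over [a, b] by sampling on a grid."""
    return max(abs(y(x)) for x in grid(a, b, n))

def norm_D(derivatives, a, b, n=2000):
    """Norm in D_m; derivatives is the list [y, y', ..., y^(m)]."""
    return max(norm_C(d, a, b, n) for d in derivatives)

def norm_Cp(y, a, b, p, n=2000):
    """p-th root of the integral of |y|^p, by the trapezoidal rule."""
    values = [abs(y(x)) ** p for x in grid(a, b, n)]
    h = (b - a) / n
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return integral ** (1.0 / p)

a, b = 0.0, math.pi
print(norm_C(math.sin, a, b))
print(norm_D([math.sin, math.cos], a, b))
print(norm_Cp(math.sin, a, b, 1))
print(norm_Cp(math.sin, a, b, 2))
```

The same function thus has different lengths in different spaces, which is the point of distinguishing the spaces by their norms.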


The set determined by the inequality

    $$\|x\| \le 1$$

is a sphere of radius 1 with centre at the zero of the space $E$ in
question. It is said to be the unit sphere of the space $E$.
The unit sphere (as also any sphere in a normed linear space) is
a convex set. In general a set $M$ in a linear space $E$ is said to be
convex if it contains, together with any two points $x$, $y$, all the
points

    $$z = \alpha x + \beta y, \quad \alpha + \beta = 1, \quad \alpha \ge 0, \quad \beta \ge 0,$$

or, in geometrical terms, contains the closed interval with end-points
$x$, $y$. The convexity of the unit sphere follows immediately
from the triangle inequality: if $\|x\| \le 1$, $\|y\| \le 1$, then

    $$\|z\| = \|\alpha x + \beta y\| \le \alpha \|x\| + \beta \|y\| \le \alpha + \beta = 1.$$


We remark some general properties of normed linear spaces
connected with the concept of convergence. Since the distance between
the points $x$, $y$ is defined as $\|x - y\|$, the convergence of a
sequence $x_n$ to the element $x$ is expressed by the relation

    $$\lim_{n \to \infty} \|x - x_n\| = 0.$$

If $x_n \to x$, $y_n \to y$, then $x_n + y_n \to x + y$; for

    $$\|(x + y) - (x_n + y_n)\| = \|(x - x_n) + (y - y_n)\| \le \|x - x_n\| + \|y - y_n\| \to 0.$$

If $x_n \to x$, $\lambda_n \to \lambda$, then $\lambda_n x_n \to \lambda x$; for

    $$\|\lambda_n x_n - \lambda x\| = \|\lambda_n (x_n - x) + (\lambda_n - \lambda) x\| \le |\lambda_n| \, \|x_n - x\| + |\lambda_n - \lambda| \, \|x\| \to 0.$$


In conclusion we shall say a few words about linear isometries
between normed linear spaces. Two isomorphic linear spaces can
have distinct norms, as, for example, $C(a, b)$ and $C_p(a, b)$. If normed
linear spaces are isomorphic as linear spaces and isometric as
metric spaces, they are said to be linearly isometric. For example,
the spaces $C(0, 1)$, $C(0, 2)$ and $C_p(0, 1)$, $C_p(0, 2)$ are linearly isometric.
The required one-one correspondences can be given by the
formulae

    $$C(0, 2) \ni \varphi(x) \leftrightarrow \varphi(2x) \in C(0, 1);$$
    $$C_p(0, 2) \ni \varphi(x) \leftrightarrow 2^{1/p} \varphi(2x) \in C_p(0, 1).$$

Instead of the precise term “linear isometry”, the looser but brief
“isomorphism” is often employed.
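The linear isometry between the integral-norm spaces on $(0, 2)$ and $(0, 1)$ can be checked numerically. The sketch below is an illustration under stated assumptions: the correspondence is read as sending a function $\varphi$ on $[0, 2]$ to the function $2^{1/p} \varphi(2x)$ on $[0, 1]$, the test function `phi` and the helper names are hypothetical, and the integrals are approximated by the trapezoidal rule.

```python
def norm_Cp(y, a, b, p, n=4000):
    """p-th root of the integral of |y|^p over [a, b] (trapezoidal rule)."""
    values = [abs(y(a + (b - a) * i / n)) ** p for i in range(n + 1)]
    h = (b - a) / n
    integral = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return integral ** (1.0 / p)

def to_unit_interval(phi, p):
    """Send phi on [0, 2] to x -> 2^(1/p) * phi(2x) on [0, 1]."""
    scale = 2.0 ** (1.0 / p)
    return lambda x: scale * phi(2.0 * x)

def phi(x):
    # Hypothetical test function on [0, 2].
    return x * x - 1.0

for p in (1, 2):
    before = norm_Cp(phi, 0.0, 2.0, p)
    after = norm_Cp(to_unit_interval(phi, p), 0.0, 1.0, p)
    print(p, before, after)
```

The factor $2^{1/p}$ compensates for the halving of the interval of integration; without it the map would still be an isomorphism of linear spaces but would not preserve the norm.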


3. Completion of a Normed Linear Space 


Like any other metric space, a normed linear space can be either
complete or incomplete. In the latter case the space $E$ can be completed
by including it in a larger complete metric space $\bar{E}$, as in
Section 7. It is to be observed that the completion of a normed
linear space is not only a metric space but also a normed linear
space. To establish this we must introduce linear operations into
the completion and verify that axioms (1) and (3) are satisfied.

Each element $X$ of the completion of a metric space $E$ was
defined by us as a symbol corresponding to a class of equivalent
fundamental sequences of $E$.

We suppose now that $E$ is a normed linear space. Then if we add
two fundamental sequences $x_1, x_2, \ldots, x_n, \ldots$ and $y_1, y_2, \ldots, y_n, \ldots$
term by term, we get the sequence

    $$x_1 + y_1, \; x_2 + y_2, \; \ldots, \; x_n + y_n, \; \ldots,$$

which is also fundamental, since

    $$\|(x_n + y_n) - (x_m + y_m)\| = \|x_n - x_m + y_n - y_m\| \le \|x_n - x_m\| + \|y_n - y_m\|.$$

Substituting the equivalent sequences $\{x_n'\}$, $\{y_n'\}$ for the sequences
$\{x_n\}$, $\{y_n\}$ we obtain the sequence of sums $\{x_n' + y_n'\}$, which is
equivalent with the constructed sequence $\{x_n + y_n\}$, since

    $$\|(x_n' + y_n') - (x_n + y_n)\| \le \|x_n' - x_n\| + \|y_n' - y_n\| \to 0.$$

This fact permits the introduction of linear operations over the
elements of the completion $\bar{E}$ of the space $E$ as follows.

We choose fundamental sequences $\{x_n\}$, $\{y_n\}$ from the classes
$X$, $Y$ respectively and consider as the sum of $X$ and $Y$ that class
which contains the fundamental sequence $\{x_n + y_n\}$.

The preceding discussion affirms the correctness of this identification,
and, in particular, the non-dependence of the result on the
choice of the sequences $\{x_n\}$, $\{y_n\}$ in $X$, $Y$.

The product of the class $X$ with a number $\lambda$ is similarly defined
by choosing a fundamental sequence $\{x_n\}$ in $X$ and understanding
as the class $\lambda X$ that class which contains the fundamental sequence
$\{\lambda x_n\}$. We leave it to the reader to substantiate the correctness
of this identification.

It is easily verified that the axioms 1-8 for a linear space are
satisfied here; by their very definition the linear operations on
classes reduce to the corresponding operations on elements of the
original space. In particular, the class $0$ consists of all sequences
of the space $E$ that converge to zero.

It only remains for us to verify that axiom (3) for a normed
linear space is satisfied. The distance between classes $X$, $Y$ is
defined in Section 7 by

    $$\rho(X, Y) = \lim_{n \to \infty} \rho(x_n, y_n),$$

where $\{x_n\}$, $\{y_n\}$ are any fundamental sequences in the classes $X$, $Y$,
respectively. In particular, fixing the fundamental sequences
$\{x_n\}$, $\{y_n\}$, $\{z_n\}$ in the classes $X$, $Y$, $Z$, we have

    $$\rho(X + Z, Y + Z) = \lim_{n \to \infty} \rho(x_n + z_n, y_n + z_n) = \lim_{n \to \infty} \rho(x_n, y_n) = \rho(X, Y),$$

so that axiom (3a) is satisfied. Similarly

    $$\rho(\lambda X, 0) = \lim_{n \to \infty} \rho(\lambda x_n, 0) = |\lambda| \lim_{n \to \infty} \rho(x_n, 0) = |\lambda| \rho(X, 0),$$

and so axiom (3b) is satisfied. Our assertion is therefore fully affirmed.




4. Factor-Space 


Let $L$ be a subspace of the linear space $R$ (with no metric as
yet). Two elements $x$, $y$ are said to be equivalent relative to $L$ if
their difference $x - y$ belongs to $L$. If $x$, $y$ are each equivalent
individually to a third element $z$, they are equivalent to each
other; for

    $$x - y = (x - z) - (y - z) \in L.$$

Thus the whole space $R$ can be partitioned into classes of mutually
equivalent elements, i.e. $x$, $y$ fall into the same class if and only
if they are equivalent. The subspace $L$ itself forms one of the
classes; the class containing the element $x_0$ is the totality of sums
$x_0 + l$, where $l$ runs over all $L$. We shall denote classes of equivalent
elements by $X, Y, \ldots$

We show that linear operations can be introduced into the set
of classes $X, Y, \ldots$ To define the sum of two classes $X$, $Y$, we
take arbitrary elements $x$, $y$ in these classes; the sum $z = x + y$
belongs to some class $Z$, which we shall regard as the sum, by
definition, of classes $X$, $Y$. Given $X$, $Y$, this definition is unique:
if $x$ is replaced by an equivalent element $x' = x + l$, $l \in L$, and $y$
by an equivalent element $y' = y + l'$, $l' \in L$, the sum $x + y$ is
replaced by $x' + y' = (x + y) + (l + l')$, equivalent to $x + y$.
The product of a class $X$ by a number $\alpha$ is similarly defined: the
class $\alpha X$ consists of all elements equivalent to the element $\alpha x$,
where $x$ is any fixed element of the class $X$. All the axioms 1-8
of a linear space (art. 1) are satisfied here automatically, since
they reduce to the corresponding axioms for the elements of $R$. In
particular, the class $L$ is the zero of the space of classes.

The linear space of classes constructed thus is said to be the
factor-space of the space $R$ with respect to the subspace $L$ and is
denoted by $R/L$.

Now let $R$ be a normed linear space and $L$ a closed subspace of
$R$. We can now define a norm on the factor-space $R/L$ by putting

    $$\|X\| = \inf_{x \in X} \|x\|.$$

Let us verify that the norm axioms are satisfied.

(a) Obviously $\|L\| = 0$ since $0 \in L$. We show that $\|X\| > 0$ if
$X \ne L$. If $\|X\| = 0$, the class $X$ contains a sequence $x_n$ for
which $\|x_n\| \to 0$ as $n \to \infty$. If $x$ is any element of the class $X$,
$x - x_n = l_n \in L$. But $x - x_n \to x$, and $L$ is closed; hence $x \in L$,
$X = L$, which contradicts the hypothesis.

(b) $\|\lambda X\| = \inf_{x \in \lambda X} \|x\| = \inf_{x \in X} \|\lambda x\| = |\lambda| \inf_{x \in X} \|x\| = |\lambda| \, \|X\|.$

(c) $\|X + Y\| = \inf_{z \in X + Y} \|z\| \le \inf_{x \in X, \, y \in Y} \|x + y\| \le \inf_{x \in X, \, y \in Y} \{\|x\| + \|y\|\} = \inf_{x \in X} \|x\| + \inf_{y \in Y} \|y\| = \|X\| + \|Y\|.$


Thus R/L is here a normed linear space. 
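A small finite-dimensional illustration may help here. The sketch below is an example under stated assumptions and not part of the general theory: $R$ is taken to be the plane with the Euclidean norm, $L$ the line spanned by a vector `direction`, and the helper names are hypothetical. The norm of a class, being the infimum of the norms of its elements, is then the distance from any of its points to the line $L$.

```python
import math

def quotient_norm(x, direction):
    """Norm of the class x + L in R^2 / L, where L = {t * direction}.

    With the Euclidean norm the infimum of |x - l| over l in L is
    attained at the orthogonal projection of x onto L.
    """
    dot = sum(a * b for a, b in zip(x, direction))
    t = dot / sum(b * b for b in direction)
    return math.sqrt(sum((a - t * b) ** 2 for a, b in zip(x, direction)))

def brute_force(x, direction, span=10.0, steps=20000):
    """Approximate the same infimum by scanning t over a grid."""
    best = float("inf")
    for i in range(steps + 1):
        t = -span + 2.0 * span * i / steps
        d = math.sqrt(sum((a - t * b) ** 2 for a, b in zip(x, direction)))
        best = min(best, d)
    return best

line = (1.0, 1.0)
x = (3.0, 1.0)
x_shifted = (8.0, 6.0)       # x + 5 * line, so it lies in the same class as x
print(quotient_norm(x, line), quotient_norm(x_shifted, line), brute_force(x, line))
```

Both representatives of the class give the same value, which reflects the fact that the norm is defined on classes and not on individual elements.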

Now let $R$ be also complete; then $R/L$ is also a complete space.
We prove this by considering a fundamental sequence of classes
$X_1, X_2, \ldots, X_n, \ldots$ We find for a given $k$ a number $n_k$ such that
$\|X_{n_k + m} - X_{n_k}\| < 1/2^k$ for any $m = 1, 2, \ldots$; in particular,
$\|X_{n_{k+1}} - X_{n_k}\| < 1/2^k$. We choose an arbitrary element $x_1$ of
the class $X_{n_1}$; since the class $X_{n_2} - X_{n_1}$ consists of all the differences
$x - x_1$, where $x$ runs over the class $X_{n_2}$, we can find an
element $x_2 \in X_{n_2}$ such that $\|x_2 - x_1\| < 1$; and in the same way,
we find an element $x_3 \in X_{n_3}$ such that $\|x_3 - x_2\| < 1/2$, and so
on; the element $x_{k+1}$ belongs to the class $X_{n_{k+1}}$, and $\|x_{k+1} - x_k\|
< 1/2^{k-1}$. The sequence $x_k$ is evidently fundamental and, since $R$
is complete, converges to some element $x$. Let $X$ be the class
containing $x$; we have

    $$\|X - X_{n_k}\| \le \|x - x_k\| \to 0 \quad \text{as } k \to \infty.$$

The sequence of classes $X_{n_k}$ is therefore convergent to the class
$X$. But $X_n$ is a fundamental sequence, so that it converges as a
whole to the class $X$, as required.


Examples. 1. Let $E$ be a linear space on which a seminorm is
defined. This means that every element $x \in E$ corresponds to a
number $\|x\|$ such that the norm axioms ($\beta$), ($\gamma$) are satisfied (i.e.
$\|\lambda x\| = |\lambda| \, \|x\|$ and $\|x + y\| \le \|x\| + \|y\|$), but ($\alpha$) is not satisfied, i.e.
there are elements $x \ne 0$ for which $\|x\| = 0$. Let us show how the
space $E$ can be "improved", i.e. transformed to a new space in
which all three axioms ($\alpha$), ($\beta$), ($\gamma$) are satisfied. By virtue of ($\beta$),
($\gamma$), the totality of elements $x$ with $\|x\| = 0$ forms a subspace
$Z \subset E$. We form the factor-space $E/Z$; as we know, its elements
are the classes of elements equivalent relative to $Z$. All the elements
of a class $X$ have the same seminorm, since for $z \in Z$

    $$\|x + z\| \le \|x\| + \|z\| = \|x\|,$$
    $$\|x\| = \|(x + z) - z\| \le \|x + z\| + \|z\| = \|x + z\|.$$

We define the norm of $X$ as the common value of the seminorms
of all its elements. Axioms ($\beta$), ($\gamma$) are satisfied by construction,
since they are satisfied for the elements $x$. Axiom ($\alpha$) is also
satisfied; the norm of the class $Z$ is zero, and if $\|X\| = 0$, all the elements
$x \in X$ have zero seminorm, so that $X$ coincides with $Z$. Thus a
"genuine" normed space is obtained by forming the factor-space
$E/Z$ of $E$ with respect to the subspace $Z$ of elements with zero
seminorm.


2. The completion of a normed linear space $E$ (art. 3) can be
interpreted as the formation of the factor-space of the space $R$ of
all fundamental sequences of elements of $E$ with respect to the
subspace $Z$ of sequences convergent to zero.

The seminorm of space $R$ is in fact

    $$\|\{x_n\}\| = \lim_{n \to \infty} \|x_n\|.$$

Elements with zero seminorm correspond to sequences convergent
to zero. The classes of equivalent elements are the classes
of coterminous fundamental sequences, and $R/Z$ is the totality
of these classes, normed in accordance with the rule indicated in
example 1.


Problems. 1. Show that the triangle axiom in the definition of a normed 
linear space can be replaced by the convexity condition for the unit sphere. 


2. An arbitrary, centrally symmetric, closed, convex set $Q$ is taken on the
plane with the origin of coordinates as an interior point. Show that there
exists a norm for which $Q$ is a unit sphere.

3. Let $R$ be an $n$-dimensional normed space. Prove that the sequence
$x_m = (\xi_1^{(m)}, \ldots, \xi_n^{(m)})$ ($m = 1, 2, \ldots$) converges to zero if and only if each of
the coordinates $\xi_j^{(m)}$ tends to zero as $m \to \infty$.

Hint. Since $\|x\| \le \sum_{j=1}^{n} |\xi_j| \, \|e_j\|$, where $e_j = (0, 0, \ldots, 1, \ldots, 0)$, the 1 being
in the $j$th position, convergence by coordinates implies convergence in the
norm. To prove the converse it is enough to consider the sphere $\|x\| \le 1$
and show that the coordinates of all its points are bounded by a fixed constant.
But if for some sequence $x_m$, $\|x_m\| \le 1$, we have $\max_j |\xi_j^{(m)}| = c_m \to \infty$,
then $y_m = x_m / c_m \to 0$. At the same time, among the coordinates of each of
the vectors $y_m$ there is at least one equal in absolute value to 1. It is possible
to extract from the sequence $y_m$ a subsequence that converges in all its
coordinates, and hence also in the norm, to some vector $y \ne 0$, which
contradicts the given relation $y_m \to 0$.

4. Prove that a finite-dimensional subspace of a normed space $R$ is always
closed in $R$.

Hint. Use problem 3.




5. Let $L$ be a closed subspace of a normed space $R$, not the whole of $R$.
Show that the unit sphere of $R$ contains a vector $y$ the distance of which
from all vectors of $L$ exceeds 1/2.

Hint. Let $y_0 \in R - L$ and $d = \inf_{x \in L} \|y_0 - x\|$; further let $y_1 \in L$ be found
such that $\|y_0 - y_1\| < 2d$. Then the vector $y = \frac{y_0 - y_1}{\|y_0 - y_1\|}$ satisfies the condition.

6. The constant 1/2 in problem 5 can be replaced by any constant smaller
than 1, but cannot, in general, be replaced by 1. Consider as an example the
subspace $L$ of the space $C(-1, 1)$ composed of the functions $\varphi(x)$ for which

    $$\int_{-1}^{0} \varphi(x) \, dx = \int_{0}^{1} \varphi(x) \, dx.$$

7. An infinite-dimensional normed linear space always contains a bounded,
infinite set the elements of which are at mutual distances $> 1/2$ (F. Riesz).

Hint. Use problem 5.

Note. The result of this problem in conjunction with the results of
Section 7 shows that the compactness of its bounded sets is a necessary and
sufficient condition for the finite-dimensionality of a normed space.

8. $F$ is a closed set in a complete linear normed space $E$ and contains a
closed interval $0 \le t \le \varepsilon$, $\varepsilon = \varepsilon(x_0)$, on each ray $t x_0$, $0 \le t < \infty$. Show
that $F$ contains a sphere (I. M. Gelfand).

Hint. $E = \bigcup_{n=1}^{\infty} nF$; use problem 2 of Section 4, art. 5.

9. (continued). If the closed set $F$ of problem 8 is centrally symmetric
and convex, it contains a sphere with centre $0$.

Note. Without the convexity condition the theorem is false even on a
plane.


9. LINEAR AND QUADRATIC FUNCTIONS ON A LINEAR SPACE


The simplest functions defined on normed linear spaces are 
linear functions. 

A linear function can be defined on an unmetricised linear space
as follows: a function $f(x)$, defined on a linear space $E$, is said to
be linear (a linear functional) if it satisfies the conditions:

1° $f(x + y) = f(x) + f(y)$ for any $x, y \in E$;

2° $f(\lambda x) = \lambda f(x)$ for any $x \in E$ and any number $\lambda$.

By way of example we shall obtain the general form of a linear
functional on an $n$-dimensional space $E$. Let $e_1, e_2, \ldots, e_n$ be a
basis for the space $E$; for a given linear functional $f(x)$, we put

    $$f(e_1) = \alpha_1, \; \ldots, \; f(e_n) = \alpha_n.$$

Then for any vector $x = \sum_{j=1}^{n} \xi_j e_j \in E$ we shall have

    $$f(x) = f\Bigl(\sum_{j=1}^{n} \xi_j e_j\Bigr) = \sum_{j=1}^{n} \xi_j f(e_j) = \sum_{j=1}^{n} \alpha_j \xi_j.$$

Thus any linear functional on an $n$-dimensional space can be expressed
relative to any basis as a linear function of the coordinates
of the vector $x$.
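The general form just obtained can be tried out in coordinates. The following minimal sketch assumes the coordinate space of triples of real numbers with its standard basis; the functional `f` and the helper names are hypothetical and serve only as an illustration. The coefficients are recovered as the values of the functional on the basis vectors, and the functional is then rebuilt from them.

```python
def coefficients(f, n):
    """alpha_j = f(e_j) for the standard basis e_1, ..., e_n of R^n."""
    basis = [tuple(1.0 if i == j else 0.0 for i in range(n)) for j in range(n)]
    return [f(e) for e in basis]

def from_coefficients(alpha):
    """The functional x -> sum over j of alpha_j * xi_j."""
    return lambda x: sum(a * xi for a, xi in zip(alpha, x))

def f(x):
    # Hypothetical linear functional on R^3, used only for illustration.
    return 2.0 * x[0] - x[1] + 0.5 * x[2]

alpha = coefficients(f, 3)
g = from_coefficients(alpha)
x = (1.0, 4.0, -2.0)
print(alpha, f(x), g(x))
```

The functional is thus completely determined by the $n$ numbers it assigns to the basis vectors.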


Problems. 1. If $f(x)$ is a linear functional on a linear space $E$, the equation
$f(x) = 0$ determines a subspace $E_f$ of $E$. Show that the factor space $E/E_f$
(Section 8, art. 4) is one-dimensional.

Hint. The class $X$ of elements equivalent with respect to $E_f$ is a set on
which $f(x)$ maintains a constant value. The map $x \to f(x)$ determines an
isomorphism between the space $E/E_f$ and a one-dimensional space.

2. Linear functionals $f_1(x), \ldots, f_n(x)$ are said to be linearly independent if
from $C_1 f_1(x) + \cdots + C_n f_n(x) = 0$ it follows that $C_1 = \cdots = C_n = 0$. Let $E_n$
be the subspace of a space $E$ determined by the equations $f_1(x) = 0, \ldots, f_n(x) = 0$.
Show that the factor space $E/E_n$ is $n$-dimensional if $f_1, \ldots, f_n$ are
linearly independent.

Hint. On each class $X$ of elements equivalent with respect to $E_n$ the
functionals $f_1(x), \ldots, f_n(x)$ are constant. The map $x \to \{f_1(x), \ldots, f_n(x)\}$ of
the space $E$ into the $n$-dimensional space $R_n$ determines an isomorphism
between the factor space $E/E_n$ and some subspace $R' \subset R_n$, which has,
say, $k$ dimensions. If we had $k < n$, there would exist a linear dependence
relation $C_1 f_1(x) + \cdots + C_n f_n(x) = 0$; hence $k = n$, $R' = R_n$.

3. Let $f_1, \ldots, f_n$ be linearly independent functionals on a linear space $E$
and let $E_n$ be the subspace described in problem 2. Show that any linear
functional $g$ that reduces to zero on $E_n$ is a linear combination of the functionals
$f_1, \ldots, f_n$, so that

    $$g = \lambda_1 f_1 + \cdots + \lambda_n f_n.$$

Hint. $g$ can be defined on the classes of elements equivalent with respect
to $E_n$, and therefore on the space $E/E_n$. Then use the general form of a linear
functional on a finite-dimensional space.

4. Given any $n$ linearly independent functionals $f_1, \ldots, f_n$, find $n$ elements
$x_1, \ldots, x_n$ such that $\det \|f_j(x_k)\| \ne 0$.

Hint. Choose them one by one at random from the $n$ linearly independent
classes of the $n$-dimensional space $E/E_n$ (problem 2).

We observe that a linear function of the coordinates of a vector $x$
in an $n$-dimensional space $E$ is clearly a continuous function of $x$
in the sense of Section 7. Hence a linear functional on a finite-dimensional
space is always a continuous function.

On infinite-dimensional spaces there exist both continuous
linear functionals, and those which are not continuous. Our discussion
will be confined to the class of continuous linear functionals.




We give two important examples of continuous linear functionals
on the space $C(a, b)$ of all continuous functions on the closed
interval $[a, b]$:

(a) $F(y) = y(x_0)$, i.e. the functional $F$ maps each point of the space $C(a, b)$,
i.e. each continuous function $y(x)$, onto its value at a fixed point $x_0$ of the
closed interval $[a, b]$.

(b) $F(y) = \int_a^b f(\xi) y(\xi) \, d\xi$, where $f(\xi)$ is a fixed continuous function of $\xi$.

We leave it to the reader to verify that these functionals have the required
properties of continuity and linearity.

Notice that the linear functional $F(y) = y(x_0)$ is no longer continuous
on the space $C_p(a, b)$†.


We now cite some general properties of continuous linear func- 
tionals on a normed space. 


Lemma 1. A linear functional $f(x)$, continuous at the point $x = 0$,
is bounded (in absolute value) on any sphere $\|x\| \le r$.

For it follows from the continuity of the functional $f(x)$ at the
point $x = 0$ that there exists a sphere $U = \{x : \|x\| \le \delta\}$ on which
the values of the functional $f(x)$ are bounded by a given number $\varepsilon$.
If now $\|x\| \le r$, then $(\delta/r) x \in U$, and hence

    $$\Bigl|f\Bigl(\frac{\delta}{r} x\Bigr)\Bigr| \le \varepsilon.$$

But since the functional $f(x)$ is linear, $f[(\delta/r) x] = (\delta/r) f(x)$, so
that for $\|x\| \le r$

    $$|f(x)| \le \frac{r}{\delta} \varepsilon,$$

as required.


Lemma 2. If a certain linear functional $f(x)$ is bounded on the unit
sphere $\|x\| \le 1$ so that, say, $|f(x)| \le K$ on this sphere, then for
any $x$ the inequality

    $$|f(x)| \le K \|x\|$$

is satisfied, and the functional $f(x)$ is continuous on the whole space $E$.

† The functional $F(y)$ is defined on an incomplete space in this example.
For an example with a complete space, see the Supplement at the end of
this book.

For given any $x \ne 0$, the ratio $x / \|x\|$ lies in the unit sphere; hence,
by the boundedness condition,

    $$\Bigl|f\Bigl(\frac{x}{\|x\|}\Bigr)\Bigr| = \frac{1}{\|x\|} |f(x)| \le K, \quad \text{i.e.} \quad |f(x)| \le K \|x\|,$$

proving the first assertion of the lemma. The second assertion
follows from the inequality

    $$|f(x) - f(x_0)| = |f(x - x_0)| \le K \|x - x_0\|.$$

As a consequence of lemmas 1 and 2 we obtain the result that
the continuity of a linear functional at the single point $x = 0$ implies
its continuity everywhere. The exact upper bound of $|f(x)|$ on the
unit sphere is called the norm of the functional $f$ and is written as
$\|f\|$. By lemma 2,

    $$|f(x)| \le \|f\| \cdot \|x\|$$

for any $x \in E$.
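The norm of a functional can be computed explicitly in a simple case. The sketch below is an illustration under stated assumptions: the space is taken to be the coordinate space of triples with the norm equal to the largest absolute value of a coordinate, so that the unit sphere is a cube, and the coefficients `alpha` and the helper names are hypothetical. A linear function attains its largest absolute value on a cube at one of the vertices, so the exact upper bound is found by scanning the sign patterns; the inequality between the value of the functional and the product of the two norms is then tested on random vectors.

```python
import itertools
import random

def functional(alpha):
    """The linear functional x -> sum over j of alpha_j * xi_j."""
    return lambda x: sum(a * xi for a, xi in zip(alpha, x))

def max_norm(x):
    return max(abs(xi) for xi in x)

def functional_norm(alpha):
    """Exact upper bound of |f(x)| over the cube max |xi_j| <= 1.

    It is enough to scan the 2^n vertices of the cube.
    """
    f = functional(alpha)
    n = len(alpha)
    return max(abs(f(v)) for v in itertools.product((-1.0, 1.0), repeat=n))

alpha = [2.0, -1.0, 0.5]
f = functional(alpha)
norm_f = functional_norm(alpha)

random.seed(0)
worst_ratio = 0.0
for _ in range(1000):
    x = [random.uniform(-5.0, 5.0) for _ in alpha]
    worst_ratio = max(worst_ratio, abs(f(x)) / max_norm(x))
print(norm_f, worst_ratio)
```

For this particular norm the result is the sum of the absolute values of the coefficients; with another norm on the same space the same functional would in general have a different norm.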


Problem. A linear functional $f(x)$ is bounded on the sphere $\|x - x_0\| \le r$.
Show that it is continuous on the whole space $E$.

We often have to deal with linear functions of several arguments: 
bilinear, trilinear, etc. We now define a bilinear functional. A 
function f(y, 2) of a pair of arguments y,z, which vary over a 
linear space is said to be a bilinear functional if it is a linear func- 
tional of z for each fixed y and a linear functional of y for each 
fixed z. 

We can easily obtain the general form of a bilinear functional on
an $n$-dimensional space. Let $e_1, e_2, \ldots, e_n$ be a basis of the space
and let $f(y, z)$ be a given bilinear functional. We determine $n^2$
numbers $\beta_{jk}$ ($j, k = 1, 2, \ldots, n$) by the equations

    $$\beta_{jk} = f(e_j, e_k).$$

Now let $y = \sum_{j=1}^{n} \eta_j e_j$ and $z = \sum_{k=1}^{n} \zeta_k e_k$ be two arbitrary vectors. We
evaluate $f(y, z)$:

    $$f(y, z) = f\Bigl(\sum_{j=1}^{n} \eta_j e_j, \sum_{k=1}^{n} \zeta_k e_k\Bigr) = \sum_{j=1}^{n} \sum_{k=1}^{n} \eta_j \zeta_k f(e_j, e_k) = \sum_{j=1}^{n} \sum_{k=1}^{n} \beta_{jk} \eta_j \zeta_k, \qquad (1)$$

i.e. for the basis $\{e_j\}$ the functional $f(y, z)$ is expressed as a bilinear
form in the coordinates of the vectors $y$, $z$.
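The passage from a bilinear functional to its table of coefficients can be sketched as follows. This is an illustration only: the plane with its standard basis is assumed, and the functional `f` and the helper names are hypothetical. The coefficients are computed as the values of the functional on pairs of basis vectors, and the bilinear form built from them is compared with the original functional.

```python
def gram_matrix(f, n):
    """beta[j][k] = f(e_j, e_k) for the standard basis of R^n."""
    basis = [tuple(1.0 if i == j else 0.0 for i in range(n)) for j in range(n)]
    return [[f(ej, ek) for ek in basis] for ej in basis]

def bilinear_form(beta):
    """The functional (y, z) -> sum over j, k of beta[j][k] * eta_j * zeta_k."""
    def form(y, z):
        return sum(
            beta[j][k] * y[j] * z[k]
            for j in range(len(y))
            for k in range(len(z))
        )
    return form

def f(y, z):
    # Hypothetical bilinear functional on R^2, for illustration only.
    return y[0] * z[0] + 2.0 * y[0] * z[1] - 3.0 * y[1] * z[1]

beta = gram_matrix(f, 2)
g = bilinear_form(beta)
y, z = (1.0, 2.0), (-1.0, 3.0)
print(beta, f(y, z), g(y, z), g(y, y))
```

Putting both arguments equal, as in the last printed value, gives the corresponding quadratic form in the coordinates of a single vector.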

In the infinite-dimensional case we shall consider bilinear functionals
under additional assumptions of continuity. We recall that
a function $f(y, z)$ is said to be continuous, say for $y = 0$, $z = 0$, if
for any $\varepsilon > 0$ there exists $\delta > 0$ such that for $\|y\| < \delta$, $\|z\| < \delta$,
we have

    $$|f(y, z)| < \varepsilon.$$

From this condition for a continuous bilinear functional we can
deduce the following important estimate: for any $y$, $z$

    $$|f(y, z)| \le C \|y\| \cdot \|z\|, \qquad (2)$$

where $C$ is a fixed constant. For in virtue of the continuity of
$f(y, z)$ at $y = 0$, $z = 0$, there exists a sphere, of radius $\delta$, say, on
which $|f(y, z)|$ does not exceed a given quantity, say 1. But then
for any $y \ne 0$, $z \ne 0$

    $$\Bigl|f\Bigl(\delta \frac{y}{\|y\|}, \delta \frac{z}{\|z\|}\Bigr)\Bigr| \le 1,$$

and consequently

    $$|f(y, z)| \le \frac{1}{\delta^2} \|y\| \cdot \|z\|. \qquad (3)$$

A function of $y$ which is obtained by substituting $y$ for the argument
$z$ in a bilinear functional $f(y, z)$, i.e. a function $f(y, y)$, is
said to be a quadratic functional. It follows from (1) that a quadratic
functional on an $n$-dimensional space can always be expressed in
the form

    $$f(y, y) = \sum_{j=1}^{n} \sum_{k=1}^{n} \beta_{jk} \eta_j \eta_k, \qquad (4)$$

i.e. as a quadratic form in the coordinates of the vector $y$. In the
general case a quadratic functional satisfies the inequality

    $$|f(y, y)| \le C \|y\|^2, \qquad (5)$$

which is obtained by substituting $y$ for $z$ in (2).
Trilinear functionals, cubic functionals, etc. can be constructed
similarly.




Concluding remark 


The emergence at the beginning of the twentieth century of the 
theory of metric spaces and its profound suffusion of mathematical 
analysis were anticipated by the whole preceding development of 
analysis. The fundamental concepts of the theory, including com- 
pleteness, compactness, and separability were formulated in 1906 
by M. Fréchet (French mathematician, b. 1878). The general
definition of a normed linear space and of linear functionals on it
was introduced in 1922 by Stefan Banach (Polish mathematician,
1892-1945), and Norbert Wiener (American mathematician,
1894-1964). For further study we can recommend Hausdorff’s
Theory of Sets and Banach’s Course of Functional Analysis 
(Kiev, 1948). 


CHAPTER III 


THE CALCULUS OF VARIATIONS 


One of the central problems in the classical analysis of functions
of one or several variables was that of discovering the extrema of 
differentiable functions. Extremum problems also play an im- 
portant part in functional spaces, which we began to study in 
Chapter II. For example, the problem of determining the minimal 
surface of revolution bounded by two circles with a common axis 
(Fig. 4) can be interpreted as the problem of finding the extremum 
of the function 


    $$F(y) = 2\pi \int_a^b y \sqrt{1 + y'^2} \, dx,$$

the argument $y = y(x)$ of which is itself an element of a normed
linear space (for example, the space $D_1(a, b)$ in the case considered).

The calculus of variations has as its aim a generalisation of the 
structures of classical analysis which will make possible the solu- 
tion of similar extremum problems. More widely conceived, the 
calculus of variations is the analysis of infinitesimals (differential 
calculus) in infinite-dimensional spaces. 

1. DIFFERENTIABLE FUNCTIONALS


1. We first recall the definition of a differentiable function in 
classical analysis. 

lf a function F(x) = F(x,, 2, ...,%,) has continuous deri- 
vatives (in the usual sense) with respect to each of the arguments 
Hy, .+, %,, then its increment from the point x = (a, ..., Z,) to 
the point +h = (a, + hy, ..., % + h,) can be expressed in the 
form 


AP(c) = F(a +h) — F(x) = a OE ths 


joi 02; : 
_ & 0F(%) OF (x; + 0 h;) OF (2;) 
Seg Al dx; an; 
es et hy +r(v,h) (0<0 <1). (1) 
j=1 


The first term on the right-hand side is the total differential of the 
function F; it represents an expression linearly dependent on the 
components of the displacement vector h. In virtue of the con- 
tinuity of the partial derivatives (0 F)/(02;) the second term is an 
infinitesimal of higher order than |h|; this denotes that for any 
é > 0 there can be found 6 > 0 such that, for |A| < 6, the ine- 
quality 
Ir(a, h)| Selhl 


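holds. If $F(x)$ possesses second-order derivatives
$\partial^2 F / \partial x_i\, \partial x_k$ ($i, k = 1, 2, \ldots, n$) in a
neighbourhood of the point $x$, the quantity $r(x, h)$ is expressible
in the form

\[
r(x, h) = \frac{1}{2} \sum_{i,k=1}^{n}
  \frac{\partial^2 F(x_1 + \theta h_1, \ldots, x_n + \theta h_n)}
       {\partial x_i\, \partial x_k}\, h_i h_k
  \qquad (0 < \theta < 1),
\]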


and if these second-order derivatives are bounded in the neighbourhood
by a number $N$, a bound for $r(x, h)$ is given by

\[ |r(x, h)| \le N |h|^2, \tag{1'} \]

which shows that $r(x, h)$ is then an infinitesimal of order not less
than $|h|^2$.




Thus in the general case the first component in the sum (1) is 
predominant for sufficiently small displacements h; it is therefore 
said to be the principal linear part of the increment of the function 
F(x). The general definition of a differentiable function on the 
$n$-dimensional space $R_n$ is based on these properties. To wit, a
function $F(x)$, defined on a set $S$ in the space $R_n$, is said to be
differentiable at the point $x_0 \in S$ if its increment between $x_0$ and
the point $x_0 + h \in S$ can be expressed in the form

\[ \Delta F(x_0) = F(x_0 + h) - F(x_0) = L(x_0, h) + r(x_0, h), \]

where $L(x_0, h)$ is a linear function of the displacement $h$, and
$r(x_0, h)$ is an infinitesimal of higher order than $h$ in the sense
explained above.

2. This definition carries over to the case of a function defined 
on a normed linear space: a functional $F(y)$, defined on a normed
linear space $E$, is said to be differentiable at the point $y_0 \in E$ if
its increment between $y_0$ and the point $y_0 + h$ can be expressed in
the form

\[ \Delta F(y_0) = F(y_0 + h) - F(y_0) = L(y_0, h) + r(y_0, h), \]

where $L(y_0, h)$ is a continuous linear functional of the displacement
$h$, and $r(y_0, h)$ is an infinitesimal of higher order than $h$; this
denotes that for any $\varepsilon > 0$ a $\delta > 0$ can be found such that
for $\rho(0, h) = \|h\| < \delta$ we have
$|r(y_0, h)| \le \varepsilon \|h\|$.

We observe that there can only be one continuous linear functional
$L(y_0, h)$ satisfying the conditions put forward. For if we had

\[ \Delta F(y_0) = L_1(y_0, h) + r_1(y_0, h) = L_2(y_0, h) + r_2(y_0, h), \]

then by subtraction we should get

\[ L_1(y_0, h) - L_2(y_0, h) = r_2(y_0, h) - r_1(y_0, h) = r(y_0, h), \]

where $r(y_0, h)$ is again an infinitesimal of higher order than $h$.
The difference $L_1(y_0, h) - L_2(y_0, h) = L(y_0, h)$ represents a new
continuous linear functional of $h$. For a given $\varepsilon > 0$, we can
find $\delta > 0$ such that for $\|h\| < \delta$ the inequality

\[ |L(y_0, h)| = |r(y_0, h)| \le \varepsilon \|h\| \]

holds. Dividing through this last inequality by $\|h\|$ and using the
homogeneity of $L$, we find that on the unit sphere $\|h\| \le 1$ the
values of the linear functional $L(y_0, h)$ do not exceed $\varepsilon$ in
absolute value. But since $\varepsilon$ can be chosen arbitrarily small,
$L(y_0, h) \equiv 0$, and $L_1(y_0, h) = L_2(y_0, h)$, as required.

The linear functional $L(y_0, h)$, which we have just shown to be
uniquely determined, is said to be the differential or, more commonly,
the variation of the functional $F$ at the point $y_0$ and is denoted
by $\delta F(y_0, h)$.

It is well known that a differentiable function $F(x_1, \ldots, x_n)$ of
n variables has a derivative in every direction. This property 
carries over to differentiable functionals on any normed linear 
space: 


Lemma. If a functional $F(y)$ is differentiable at $y = y_0$, then for
any $h$ the function $F(y_0 + th)$, regarded as a function of $t$, is
differentiable with respect to $t$ in the usual sense at $t = 0$, and its
derivative is equal to $L(y_0, h) = \delta F(y_0, h)$.

Proof. The required derivative is the limit of the expressions

\[
\frac{F(y_0 + th) - F(y_0)}{t}
  = \frac{\delta F(y_0, th) + r(y_0, th)}{t}
  = \delta F(y_0, h) + \frac{r(y_0, th)}{t}. \tag{2}
\]

By the condition of the lemma, for any $\varepsilon > 0$, there exists
$\delta > 0$ such that, for $\|th\| = |t|\, \|h\| < \delta$, we have the
inequality

\[ |r(y_0, th)| \le \varepsilon \|th\| = \varepsilon |t|\, \|h\|. \]

It follows that the quotient satisfies

\[ \left| \frac{r(y_0, th)}{t} \right| \le \varepsilon \|h\|, \]

and as $t \to 0$ can be made as small as we please. This shows that
the expression on the left-hand side of (2) has the limit
$\delta F(y_0, h)$ as $t \to 0$, as required.

Note. If the functional $F(y)$ is differentiable everywhere, then for
fixed $y$, $h$, the function $F(y + th)$ is differentiable at every $t$.
For, putting $t = t_0 + \tau$, we have

\[ F(y + th) = F(y + t_0 h + \tau h) = F(y_0 + \tau h) \qquad (y_0 = y + t_0 h); \]

by what we have proved, this function of $\tau$ is differentiable with
respect to $\tau$ at $\tau = 0$; but then the function $F(y + th)$ is
differentiable with respect to $t$ at $t = t_0$.




3. Examples. 1. A continuous linear functional F (y) is evidently 
always a differentiable function, since 


\[ F(y + h) - F(y) = F(h) \]


and the total increment of the functional reduces to a linear ex- 
pression in h. 


2. Let us consider the functional 


\[ F(y) = \int_a^b f(x, y(x))\, dx \]

on the space $C(a, b)$ of continuous functions on the closed interval
$[a, b]$. The kernel $f(x, y)$ is assumed to be continuous and to have
continuous first and second partial derivatives (in the usual sense)
as a function of its arguments in the region $a \le x \le b$,
$-\infty < y < \infty$. We give the functional $F(y)$ an increment in which
the functional argument $y(x)$ receives a displacement $h(x)$:

\[
\Delta F(y) = F(y + h) - F(y)
  = \int_a^b \{ f[x, y(x) + h(x)] - f[x, y(x)] \}\, dx. \tag{3}
\]


In accordance with the assumptions on $f(x, y)$ we have

\[ f(x, y + h) - f(x, y) = \frac{\partial f(x, y)}{\partial y}\, h + r(x, y, h), \]

where, for fixed $x$, $y$, $r(x, y, h)$ is an infinitesimal of order higher
than $h$. In each bounded (in respect of $y$) region, the second
derivatives of the function $f(x, y)$ are bounded in absolute value by
a number, $N$ say, and by what has gone before, $r(x, y, h)$ is bounded
according to the inequality

\[ |r(x, y, h)| \le N h^2. \]


Hence the integral (3) assumes the form

\[ \Delta F(y) = \int_a^b \frac{\partial f(x, y)}{\partial y}\, h(x)\, dx + \varepsilon(h), \]

where the quantity $\varepsilon(h)$ is bounded in absolute value by the number

\[
\int_a^b |r(x, y, h(x))|\, dx \le N \int_a^b h^2(x)\, dx
  \le N (b - a) \max |h^2(x)| \le N (b - a) \|h\|^2.
\] †

We see that the increment of the functional $F(y)$ is split into a
principal linear part and an infinitesimal of higher order. Thus
it is differentiable on the space $C(a, b)$ and its variation has the
form

\[ \delta F(y, h) = \int_a^b \frac{\partial f(x, y)}{\partial y}\, h(x)\, dx. \]
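The splitting of the increment in Example 2 can be checked numerically. The following is only a sketch under stated assumptions, none of which come from the text: the kernel $f(x, y) = x y^3$, the interval $[0, 1]$, the choices $y = \cos x$ and $h = \sin x$, and the helper `integrate` (a composite Simpson rule) are illustrative. It computes the remainder $F(y + th) - F(y) - \delta F(y, th)$ divided by $|t|$, which should shrink with $t$ if the variation really is the principal linear part.

```python
import math


def integrate(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3


# Assumed kernel f(x, y) = x * y**3 and its partial derivative in y.
def f(x, y):
    return x * y ** 3


def f_y(x, y):
    return 3 * x * y ** 2


def F(y):
    """The functional F(y): integral of f(x, y(x)) over [0, 1]."""
    return integrate(lambda x: f(x, y(x)), 0.0, 1.0)


def variation(y, h):
    """The variation: integral of f_y(x, y(x)) * h(x) over [0, 1]."""
    return integrate(lambda x: f_y(x, y(x)) * h(x), 0.0, 1.0)


def remainder_ratio(y, h, t):
    """|F(y + t h) - F(y) - variation(y, t h)| / |t|."""
    shifted = lambda x: y(x) + t * h(x)
    return abs(F(shifted) - F(y) - t * variation(y, h)) / abs(t)


ratio_large = remainder_ratio(math.cos, math.sin, 1e-1)
ratio_small = remainder_ratio(math.cos, math.sin, 1e-3)
print(ratio_large, ratio_small)
```

For this kernel the remainder is of order $t^2$, so the printed ratio should fall roughly in proportion to $t$.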


3. We consider the functional 
\[ F(y) = \int_a^b f(x, y(x), y'(x))\, dx \]

on the space $D_1(a, b)$ of continuous functions on the closed interval
$[a, b]$ with continuous derivatives of the first order. The kernel
$f(x, y, y')$ is assumed to be a continuous function, defined on the
region $a \le x \le b$, $-\infty < y < \infty$, $-\infty < y' < \infty$, and
having continuous derivatives up to the second order. We give the
functional $F(y)$ an increment in which the argument $y$ goes from
$y = y(x)$ to $y = y(x) + h(x)$, with $h \in D_1(a, b)$:

\[
\begin{aligned}
\Delta F(y) &= F(y + h) - F(y) \\
&= \int_a^b \{ f[x, y(x) + h(x), y'(x) + h'(x)] - f[x, y(x), y'(x)] \}\, dx \\
&= \int_a^b \left[ \frac{\partial f}{\partial y}\, h(x)
   + \frac{\partial f}{\partial y'}\, h'(x) \right] dx
   + \int_a^b r(x, y, y', h, h')\, dx.
\end{aligned}
\]


The first component is evidently linear in the displacement $h(x)$.
If $N$ denotes the upper bound of the second derivatives of the
function $f$ with respect to the arguments $y$, $y'$ in some region
bounded in respect of $y$, $y'$, and further, since we know that

\[ \|h\| = \max_{a \le x \le b} \{ |h(x)|, |h'(x)| \} \]

in the space $D_1(a, b)$, on writing $\mu = \|h\|$, the second component
will be governed by the inequality (cf. inequality (1')):

\[ \int_a^b |r(x, y, y', h, h')|\, dx \le 4N \int_a^b \mu^2\, dx = 4N \mu^2 (b - a). \]

† Here and elsewhere $\{h(x)\}^2$ is written as $h^2(x)$.

The expression obtained is of the second order of smallness with
respect to $\|h\|$. We see that the functional $F(y)$ is differentiable
on the space $D_1(a, b)$ and its variation has the form

\[
\delta F(y, h) = \int_a^b \left[ \frac{\partial f}{\partial y}\, h(x)
  + \frac{\partial f}{\partial y'}\, h'(x) \right] dx.
\]


4. Similarly the functional 


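\[ F(y) = \int_a^b f(x, y(x), \ldots, y^{(n)}(x))\, dx \]

is differentiable on the space $D_n(a, b)$; its variation has the form

\[
\delta F = \int_a^b \left[ \frac{\partial f}{\partial y}\, h(x) + \cdots
  + \frac{\partial f}{\partial y^{(n)}}\, h^{(n)}(x) \right] dx.
\]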


Here again we require the existence and continuity of the nth 
derivatives (with respect to all arguments) of the function f. 

5. We can also consider functionals which have as argument a 
function of several variables. For definiteness, consider the func- 
tional 


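\[ F(z) = \iint_G f(x, y, z(x, y), z_x(x, y), z_y(x, y))\, dx\, dy, \tag{4} \]

where we have written for brevity

\[ z_x = \frac{\partial z}{\partial x}, \qquad z_y = \frac{\partial z}{\partial y}. \]

It is natural to consider it on the space $D_1(G)$ of functions $z(x, y)$,
continuous together with their first partial derivatives on a region
$G$. The norm in this space is given by the formula

\[
\|z\| = \max_{(x, y) \in G} \left\{ |z(x, y)|,\
  \left| \frac{\partial z(x, y)}{\partial x} \right|,\
  \left| \frac{\partial z(x, y)}{\partial y} \right| \right\}.
\]

Assuming that the function $f$ has continuous second derivatives
with respect to all its arguments, we form an expression for the
increment in the functional $F$ corresponding to an increment
$h(x, y)$ in the functional argument $z(x, y)$:

\[
\begin{aligned}
\Delta F &= \iint_G [ f(x, y, z + h, z_x + h_x, z_y + h_y) - f(x, y, z, z_x, z_y) ]\, dx\, dy \\
&= \iint_G \left[ \frac{\partial f}{\partial z}\, h + \frac{\partial f}{\partial z_x}\, h_x
   + \frac{\partial f}{\partial z_y}\, h_y \right] dx\, dy \\
&\quad + \iint_G r(x, y, z, z_x, z_y, h, h_x, h_y)\, dx\, dy.
\end{aligned}
\]

The first component is linear in $h$; a bound for the second is given,
for $\|h\| = \mu$, by the inequality

\[
\iint_G |r(x, y, z, z_x, z_y, h, h_x, h_y)|\, dx\, dy
  \le 9N \iint_G \mu^2\, dx\, dy = 9N \mu^2 |G|,
\]

where $N$ denotes the upper bound of the absolute values of the
second derivatives of $f$, and $|G|$ is the area of the region $G$. The
functional (4) is therefore differentiable on the space $D_1(G)$, and
its variation has the form

\[
\delta F(z, h) = \iint_G \left[ \frac{\partial f}{\partial z}\, h
  + \frac{\partial f}{\partial z_x}\, h_x
  + \frac{\partial f}{\partial z_y}\, h_y \right] dx\, dy.
\]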


6. We now consider a functional dependent on several functional 
arguments, say 


\[ F(y_1, \ldots, y_n) = \int_a^b f(x, y_1(x), \ldots, y_n(x), y_1'(x), \ldots, y_n'(x))\, dx. \]

It is natural to consider it on the space $D_1^{(n)}$ which has as elements
the vector functions $y = (y_1(x), \ldots, y_n(x))$ ($a \le x \le b$). We define
the norm of a vector function $y = (y_1(x), \ldots, y_n(x))$ by the formula

\[
\|y\| = \max_{a \le x \le b} \{ |y_1(x)|, \ldots, |y_n(x)|,\ |y_1'(x)|, \ldots, |y_n'(x)| \}.
\]

It is easily verified that all the axioms for a normed linear space
are satisfied.

Assuming that $f$ has bounded (by some number $N$) second derivatives
with respect to $y_1, \ldots, y_n'$, we evaluate the increment of
the functional $F$ corresponding to an increment $h = (h_1, \ldots, h_n)$
in the vector function $y = (y_1, \ldots, y_n)$:

\[
\begin{aligned}
\Delta F(y_1, \ldots, y_n) &= \int_a^b f(x, y_1 + h_1, \ldots, y_n' + h_n')\, dx
   - \int_a^b f(x, y_1, \ldots, y_n')\, dx \\
&= \int_a^b \left[ \frac{\partial f}{\partial y_1}\, h_1 + \cdots
   + \frac{\partial f}{\partial y_n'}\, h_n' \right] dx
   + \int_a^b r(x, y_1, \ldots, y_n', h_1, \ldots, h_n')\, dx.
\end{aligned}
\]

The first term in the result obtained is linear in $h = (h_1, \ldots, h_n)$.
A bound for the second term is given by the inequality

\[
\int_a^b |r(x, y_1, \ldots, y_n', h_1, \ldots, h_n')|\, dx
  \le n^2 N \int_a^b \mu^2\, dx = n^2 N (b - a) \mu^2
\]

for $\|h\| \le \mu$. This quantity is of the second order in $\mu$; hence the
functional $F$ is differentiable and its variation has the form

\[
\delta F(y, h) = \int_a^b \left[ \frac{\partial f}{\partial y_1}\, h_1 + \cdots
  + \frac{\partial f}{\partial y_n'}\, h_n' \right] dx.
\]

Problems. 1. Determine whether the following functionals are differentiable:
(a) $F(y) = y(a)$ on the space $C(a, b)$;
(b) $F(y) = y(a)$ on the space $D_1(a, b)$;
(c) $F(y) = y^2(a)$ on the space $C(a, b)$;
(d) $F(y) = \sqrt{1 + y'^2(a)}$ on the space $D_1(a, b)$;
(e) $F(y) = |y(a)|$ on the space $C(a, b)$.

Answer. (a) to (d) are differentiable, (e) not.


2. Show that if the functional $F(y)$ is differentiable, so is the functional
$F^2(y)$. Give the variation of $F^2(y)$.


Answer. $\delta F^2(y, h) = 2 F(y)\, \delta F(y, h)$.


4. Just as in classical analysis, the remainder r(y, h) in some 
cases admits of a further interpretation. We suppose that the 
remainder $r(y, h)$ of an increment of a differentiable functional
$F(y)$, obtained by extracting the principal linear part, can be
resolved into a quadratic functional and a new remainder with
higher than second-order smallness

\[ r(y, h) = \frac{1}{2}\, Q(y, h, h) + r_1(y, h), \]

so that, for any $\varepsilon > 0$, we can find $\delta > 0$ such that, for
$\|h\| < \delta$,

\[ |r_1(y, h)| \le \varepsilon \|h\|^2. \]


In this case the quadratic functional Q(y, h, h) is said to be the 
second differential or second variation of the functional and the 
functional F(y) is said to be twice differentiable. The second 
differential, like the first, is uniquely determined. 

Functionals $F(y)$ of integral type, which we considered above,
e.g.

\[ F(y) = \int_a^b f(x, y, y')\, dx, \tag{1} \]

on the space $D_1(a, b)$, are twice differentiable if the integrand $f$
has continuous derivatives up to the third order. Expressions for
the second variation of such functionals are easily obtained by
expanding $f$ in its Taylor expansion for the argument $y + h$
as far as the third-order terms. Thus, for the functional

\[ F(y) = \int_a^b f(x, y)\, dx \]

the second variation takes the form

\[ \delta^2 F(y, h) = \int_a^b f_{yy}(x, y)\, h^2(x)\, dx; \tag{2} \]

for the functional (1),

\[ \delta^2 F(y, h) = \int_a^b [ f_{yy} h^2 + 2 f_{yy'} h h' + f_{y'y'} h'^2 ]\, dx; \tag{3} \]

for the functional

\[ F(y) = \int_a^b f(x, y, y', \ldots, y^{(n)})\, dx \]

we have

\[
\delta^2 F = \int_a^b [ f_{yy} h^2 + 2 f_{yy'} h h' + \cdots
  + f_{y^{(n)} y^{(n)}} (h^{(n)})^2 ]\, dx; \tag{4}
\]

for the functional

\[ F(z) = \iint_G f(x, y, z, z_x, z_y)\, dx\, dy \]

we have similarly

\[
\delta^2 F = \iint_G [ f_{zz} h^2 + 2 f_{z z_x} h h_x + \cdots
  + f_{z_y z_y} h_y^2 ]\, dx\, dy; \tag{5}
\]

finally for the functional

\[ F(y_1, \ldots, y_n) = \int_a^b f(x, y_1, \ldots, y_n, y_1', \ldots, y_n')\, dx \]

we get

\[
\delta^2 F = \int_a^b \Big[ \sum_{i,k} f_{y_i y_k} h_i h_k
  + 2 \sum_{i,k} f_{y_i y_k'} h_i h_k'
  + \sum_{i,k} f_{y_i' y_k'} h_i' h_k' \Big]\, dx. \tag{6}
\]


Problems. 1. Establish the uniqueness of the second variation. 
2. Show that a quadratic functional is differentiable and find its first 
and second variations. 


Answer. $\delta A(y, y; h) = A(y, h) + A(h, y)$, $\delta^2 A(y, y; h, h) = 2 A(h, h)$.


3. Find the second variation of the functional $e^{F(y)}$, where $F(y)$ is a twice
differentiable functional.


Answer. $\delta^2 (e^{F}) = \{ [\delta F(y, h)]^2 + \delta^2 F(y, h, h) \}\, e^{F(y)}$.


2. EXTREMA OF DIFFERENTIABLE FUNCTIONALS


1. Let us consider some differentiable functional $F$ on a normed
linear space $E$. We set ourselves the task of finding the points $y$
at which the functional $F$ attains extremal values, that is, maxima or
minima.

By definition, $F$ has a relative minimum at a point $y_0$ if there
exists a neighbourhood of $y_0$ (a sphere with centre $y_0$) within
which the inequality

\[ F(y) \ge F(y_0) \]

is satisfied. If $y_0$ has a neighbourhood within which the converse
inequality

\[ F(y) \le F(y_0) \]

is satisfied, $F$ is said to have a relative maximum at $y_0$. In analysis
the relative extrema of differentiable functions are determined
by equating their differentials with zero. We shall show that a
similar principle applies to differentiable functionals on normed
linear spaces.


Lemma. At any extremum $y_0$ of a differentiable functional $F(y)$,
the first variation $\delta F(y_0, h)$ of $F$ is equal to zero for any
displacement $h$.

Proof. For an arbitrarily assigned $h$ we consider the function
$F(y_0 + th)$ of the variable $t$. This function is differentiable with
respect to $t$ and takes on an extremal value at $t = 0$. Hence its
derivative becomes zero at $t = 0$. But by the lemma of Section 1
this derivative is equal to $\delta F(y_0, h)$. Thus for any $h$ the
expression $\delta F(y_0, h)$ is equal to zero, as required.

Any point $y_0$ at which the first variation $\delta F(y_0, h)$ of a
functional $F(y)$ vanishes for any $h$ is said to be a stationary point
of the functional.

We have then to determine at which points $y_0$ the variation
$\delta F(y_0, h)$ vanishes (identically in $h$), and we shall then have the
stationary points of the functional. But we have further to identify
which of all the stationary points are maxima and minima, in
which we are interested.

2. If the functional F(y) is twice differentiable, we can turn to 
its second variation in investigating this last question. Since the 
first variation of the functional vanishes at a stationary point $y_0$,
the increment of the functional under a displacement $h$ from $y_0$
is expressible in the form

\[ \Delta F = \frac{1}{2}\, \delta^2 F(y_0, h) + r_1(y_0, h), \]

where the quantity $r_1(y_0, h)$ is, for any $\varepsilon > 0$, bounded on the
sphere $\|h\| < \delta(\varepsilon)$ by the inequality

\[ |r_1(y_0, h)| \le \varepsilon \|h\|^2. \]

This bound can be expressed in the form of an equation

\[ r_1(y_0, h) = \theta \varepsilon \|h\|^2, \qquad \text{where } -1 \le \theta \le 1. \]

Hence for $\|h\| < \delta(\varepsilon)$ the expression for the increment of the
functional reduces to the form

\[ \Delta F(y_0, h) = \frac{1}{2}\, \delta^2 F(y_0, h) + \theta \varepsilon \|h\|^2. \tag{1} \]




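We can now formulate a necessary condition and a sufficient condition
for a minimum:

(a) If $y_0$ is a point at which a functional $F(y)$ is a minimum, then
$\delta^2 F(y_0, h) \ge 0$ for any $h$.

(b) If at a stationary point $y_0$ the inequality

\[ \delta^2 F(y_0, h) \ge C \|h\|^2 \qquad (C > 0 \text{ fixed}) \tag{1'} \]

is satisfied, $y_0$ determines a minimum of the functional $F(y)$.

For the proof of assertion (a) we suppose the contrary to be
true, i.e. we suppose that for some $h_0$, $\delta^2 F(y_0, h_0) < 0$. We choose

\[ \varepsilon < \frac{|\delta^2 F(y_0, h_0)|}{2 \|h_0\|^2} \]

and put $h = t h_0$, where $t$ is so small that
$\|h\| = |t|\, \|h_0\| < \delta(\varepsilon)$. Then, since the second variation is
quadratic in the displacement,

\[
\Delta F(y_0, h) = \frac{1}{2}\, \delta^2 F(y_0, t h_0) + \theta \varepsilon \|t h_0\|^2
  = t^2 \left[ \frac{1}{2}\, \delta^2 F(y_0, h_0) + \theta \varepsilon \|h_0\|^2 \right] < 0,
\]

so that the functional $F(y)$ does not have a minimum at the
point $y_0$.

To prove assertion (b) we choose $\varepsilon < C/2$ and find a lower
bound for the increment of the functional $F(y)$ on a sphere of
radius $\delta(\varepsilon)$, centre $y_0$. Using (1') in (1), we get

\[
\Delta F(y_0, h) = \frac{1}{2}\, \delta^2 F(y_0, h) + \theta \varepsilon \|h\|^2
  \ge \|h\|^2 \left( \frac{C}{2} + \theta \varepsilon \right) > 0
  \qquad \text{for } h \ne 0,
\]

and hence the functional $F(y)$ has a minimum at the point $y_0$.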
In general condition (b) cannot be weakened by substituting
what would seem the natural condition $\delta^2 F(y_0, h) > 0$ for all
$h \ne 0$. A counter example to this condition is the functional

\[
F(y) = \int_0^1 x\, y^2(x)\, dx - \int_0^1 y^3(x)\, dx
  = \int_0^1 y^2(x)\, [x - y(x)]\, dx
\]

on the space $C(0, 1)$. The point $y(x) \equiv 0$ is a stationary point for
this functional, and the second variation

\[ \delta^2 F(0, h) = 2 \int_0^1 x\, h^2(x)\, dx \]

is positive for each function $h(x) \not\equiv 0$. But in any neighbourhood
of zero the functional also assumes negative values; for a given
$\varepsilon > 0$ ($\varepsilon \le 1$), it is enough to take as $y(x)$ the
non-negative function that equals $\varepsilon - x$ for $x < \varepsilon$ and
vanishes for $x \ge \varepsilon$, for which $\|y\| = \varepsilon$ and
$F(y) = -\varepsilon^4/6 < 0$.
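The counter example can be verified by direct quadrature. The sketch below is illustrative only: the helper `integrate` (a composite Simpson rule), the function name `tent`, and the sample value $\varepsilon = 0.1$ are assumptions. It evaluates $F$ on the function equal to $\varepsilon - x$ for $x < \varepsilon$ and zero elsewhere, and evaluates $2\int_0^1 x\,h^2\,dx$ for the same function taken as the displacement $h$.

```python
def integrate(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3


def F(y):
    """F(y) = integral over [0, 1] of y(x)^2 * (x - y(x))."""
    return integrate(lambda x: y(x) ** 2 * (x - y(x)), 0.0, 1.0)


def second_variation_at_zero(h):
    """2 * integral over [0, 1] of x * h(x)^2."""
    return 2.0 * integrate(lambda x: x * h(x) ** 2, 0.0, 1.0)


def tent(eps):
    """The function equal to eps - x for x < eps and to 0 for x >= eps."""
    return lambda x: max(eps - x, 0.0)


eps = 0.1
value = F(tent(eps))                                   # negative: about -eps**4 / 6
curvature = second_variation_at_zero(tent(eps))        # positive: about  eps**4 / 6
print(value, curvature)
```

Both numbers have magnitude $\varepsilon^4/6$ but opposite signs, which is why positivity of the second variation alone does not force a minimum here.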


Problems. 1. Show that a linear functional, not identically zero, has no
extrema.
2. Show that the extremum theory of the functional

\[ F(y) = f(y(x))\big|_{x = a} \]

on the space $C(a, b)$ coincides with the usual extremum theory for a
function $f(\xi)$.
3. Show that the extremum theory of the functional

\[ F(y) = f(y(a), y(b)) \]

on the space $C(a, b)$ coincides with the usual extremum theory for a function
of two variables $f(\xi, \eta)$.

3. By way of illustration we shall analyse the extremum prob- 
lem for a functional of type 


\[ F(y) = \int_a^b f(x, y(x))\, dx. \]

This functional has a first variation (p. 83)

\[ \delta F(y, h) = \int_a^b \frac{\partial f(x, y)}{\partial y}\, h(x)\, dx. \]

Let $y_0 = y_0(x)$ be an extremal point. The expression $\delta F(y_0, h)$
must vanish at this point for any function $h(x)$. We have the
equation

\[ \int_a^b \frac{\partial f(x, y_0(x))}{\partial y}\, h(x)\, dx = 0. \tag{2} \]

It will be shown below that such an equation can only hold
provided that

\[ \frac{\partial f(x, y_0(x))}{\partial y} = 0 \qquad \text{identically in } x. \tag{3} \]

If this equation is solved for $y_0$, it will in general yield one or
several functions of $x$, and these are the only elements of the
space $C(a, b)$ that can give extrema of the functional under
consideration.




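The second variation of the functional $F(y)$ has the form (p. 87)

\[ \delta^2 F(y_0, h) = \int_a^b \frac{\partial^2 f(x, y_0)}{\partial y^2}\, h^2(x)\, dx. \]

If it is non-negative for all $h(x)$, then clearly $f_{yy} \ge 0$. The
inequality $f_{yy} \ge 0$ is therefore a necessary condition for a minimum
of the functional $F$. On the other hand if we have $f_{yy}(x, y_0(x)) > 0$
for all $x$, then a stationary point is a minimum of the functional,
since

\[
\begin{aligned}
\Delta F(y_0, h) &= \int_a^b \left[ \frac{1}{2}\, f_{yy}(x, y_0(x))\, h^2(x)
   + \frac{1}{3!}\, f_{yyy}(x, y_0(x) + \theta(x) h(x))\, h^3(x) \right] dx \\
&= \int_a^b h^2(x) \left[ \frac{1}{2}\, f_{yy} + \frac{1}{3!}\, f_{yyy}\, h \right] dx > 0
\end{aligned}
\]

for sufficiently small $h$.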

4. We turn now to equation (2). We have to show that if it 
holds for all continuous h(x), then equation (3) holds. To do this 
we employ a lemma which will be of repeated use to us: 


Lemma. If, for a continuous function $A(x)$, the equation

\[ \int_a^b A(x)\, h(x)\, dx = 0 \tag{4} \]

holds for any continuous function $h(x)$, then $A(x) \equiv 0$.

Proof. If $A(x) \not\equiv 0$, there exists an interior point $x_0$ of the closed
interval $[a, b]$ at which $A(x_0) \ne 0$. For definiteness let $A(x_0) = c > 0$
and let $U(x_0) = \{ x : |x - x_0| < \delta \}$ be a neighbourhood of $x_0$
in which the inequality $A(x) > c/2$ is satisfied. We consider any
non-negative continuous function $h(x)$ that vanishes outside the
neighbourhood $U(x_0)$ and is positive at $x = x_0$. Then it is evident that

\[
\int_a^b A(x)\, h(x)\, dx = \int_{|x - x_0| < \delta} A(x)\, h(x)\, dx
  > \frac{c}{2} \int_{|x - x_0| < \delta} h(x)\, dx > 0,
\]

which contradicts the original hypothesis. The lemma is proved.


Note 1. The class of functions h(x) for which we must demand 
that equation (4) hold in order to validate the result can be
considerably reduced; as is evident from the construction, we can
impose on h(x) any smoothness restrictions we like (right up to 
infinite differentiability). We can also assume h(x) to vanish in 
the vicinity of the end-points of the interval. We observe moreover 
that the lemma remains valid in the analogous formulation when 
a bounded region G is substituted for the closed interval [a, b] 
and in the case of several independent variables. 


Note 2. We can easily convince ourselves by reviewing the formulations
above that they hold in somewhat more general situations.
The functional $F(y)$ can be defined, not on a whole space $E$,
but on some subset $E' \subset E$ with the property that, together with
any two points $y$, $y + h$, it contains all the points of the form
$y + th$, $-\infty < t < \infty$; in other words, it contains the whole
line determined by the points $y$, $y + h$. A subset $E' \subset E$ which
possesses this property is said to be a linear manifold in $E$. In
problems to be considered presently, a functional, defined and
differentiable on a whole space $E$, will be considered just on some
linear manifold $E' \subset E$, and we shall seek those points $y_0 \in E'$ at
which the functional assumes a value that is extremal relative to
displacements in the linear manifold. To solve such a problem, we
have to consider the variation $\delta F(y, h)$ only for $y \in E'$, and look
for the points $y_0$ for which $\delta F(y_0, h)$ vanishes for any displacement
$h$ that does not take us outside the manifold $E'$. Similarly
for the second variation $\delta^2 F(y_0, h)$.


3. FUNCTIONALS OF THE TYPE $\int_a^b f(x, y, y')\, dx$


We discuss these functionals, which are frequently encountered
in problems of mathematics and mechanics, in some detail.
1. As we saw in Section 1 (p. 84), the functional

\[ F(y) = \int_a^b f(x, y, y')\, dx \]

on the space $D_1(a, b)$ has the variation

\[
\delta F(y, h) = \int_a^b \left[ \frac{\partial f}{\partial y}\, h(x)
  + \frac{\partial f}{\partial y'}\, h'(x) \right] dx. \tag{1}
\]


To find the extremal points of the functional $F$, we must equate its
variation $\delta F(y, h)$ with zero. For the required functions $y = y(x)$,
we can obtain several variants on the conditions, depending on the
manifolds on which the functional is defined.

We consider first the case in which the functional $F$ is defined
on the set of functions $y(x)$ which take on fixed values $y(a)$, $y(b)$
at the points $a$, $b$. The function $h(x)$ must then vanish at the
end-points of the interval $[a, b]$. This set is evidently a linear manifold
in $D_1(a, b)$ and we can treat it in accordance with note 2 of
Section 2.

We suppose further that the solution y = y(x) sought possesses 
a continuous second derivative. (This restriction will subsequently 
be withdrawn.) Then the coefficients of both h(x) and h’ (x) in the 
integral (1), where the required solution y(x) has been substituted 
for y, will be differentiable functions of x. Integrating the second 
term by parts, we get: 

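\[
\int_a^b \frac{\partial f}{\partial y'}\, h'(x)\, dx
  = \frac{\partial f}{\partial y'}\, h(x) \Big|_a^b
  - \int_a^b \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) h(x)\, dx.
\]

The first term on the right vanishes since $h(a) = h(b) = 0$. Hence
the expression for the variation is modified to the form

\[
\delta F(y, h) = \int_a^b \left[ \frac{\partial f}{\partial y}
  - \frac{d}{dx} \frac{\partial f}{\partial y'} \right] h(x)\, dx.
\]

At the required extremal point the variation $\delta F(y, h)$ must
vanish for any $h(x)$. It follows (as in Section 2 above) that the
coefficient of $h(x)$ vanishes identically. And so for the unknown
function $y = y(x)$ we have the equation

\[ \frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} = 0 \tag{2} \]

(the Euler equation). Expanding the total derivative with respect
to $x$, we can write this equation in the form

\[ f_y - f_{x y'} - f_{y y'}\, y' - f_{y' y'}\, y'' = 0. \]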


This is an ordinary differential equation of the second order, linear
in the highest derivative. It follows that if an extremum of the
functional $F$ exists and is attained at a function $y = y(x)$ that
possesses a derivative of the second order, then the function $y = y(x)$
satisfies Euler’s equation.
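As a concrete check of the Euler equation, consider a functional chosen purely for illustration (it does not appear in the text): $F(y) = \int_0^1 (y'^2 + y^2)\,dx$ with $y(0) = 0$, $y(1) = 1$. Here $f_y = 2y$ and $f_{y'} = 2y'$, so equation (2) reads $y'' = y$, and the extremal through the given end-points is $y = \sinh x / \sinh 1$. The sketch below, with an assumed Simpson helper `integrate` and an assumed test displacement $h = \sin \pi x$, confirms numerically that the first variation vanishes on this extremal and that a displaced curve gives a larger value.

```python
import math


def integrate(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3


def F(y, dy):
    """F(y) = integral over [0, 1] of y'^2 + y^2 (the derivative dy is supplied explicitly)."""
    return integrate(lambda x: dy(x) ** 2 + y(x) ** 2, 0.0, 1.0)


def first_variation(y, dy, h, dh):
    """Integral of f_y * h + f_y' * h' with f_y = 2y and f_y' = 2y'."""
    return integrate(lambda x: 2 * y(x) * h(x) + 2 * dy(x) * dh(x), 0.0, 1.0)


# Extremal of y'' = y with y(0) = 0, y(1) = 1, and its derivative.
y0 = lambda x: math.sinh(x) / math.sinh(1.0)
dy0 = lambda x: math.cosh(x) / math.sinh(1.0)

# Assumed displacement vanishing at both end-points.
h = lambda x: math.sin(math.pi * x)
dh = lambda x: math.pi * math.cos(math.pi * x)

delta = first_variation(y0, dy0, h, dh)
t = 0.1
F_extremal = F(y0, dy0)
F_displaced = F(lambda x: y0(x) + t * h(x), lambda x: dy0(x) + t * dh(x))
print(delta, F_extremal, F_displaced)
```

Since this integrand is quadratic, the increment equals $t^2 \int_0^1 (h'^2 + h^2)\,dx$ exactly, so the displaced value exceeds the extremal one for every such $h$.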

The general solution of Euler’s equation, as of any second-order
differential equation, contains two parameters $C_1$, $C_2$. Each
individual solution that is obtained for fixed $C_1$, $C_2$ is said to be
an extremal of the functional $F$. By a suitable choice of the
constants $C_1$, $C_2$ we can generally find an extremal that satisfies
prescribed conditions $y(a) = y_1$, $y(b) = y_2$ (there may be several
such extremals). If no solution of Euler’s equation satisfies these
conditions, this signifies that our extremum problem has no
solution in the class of twice differentiable functions.

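2. Consideration of the second variation of the functional $F$
enables us to obtain further necessary conditions for an extremum,
and also sufficient conditions.

As we found in Section 1 (p. 87) the second variation has the
form

\[ \delta^2 F = \int_a^b [ f_{yy} h^2 + 2 f_{yy'} h h' + f_{y'y'} h'^2 ]\, dx. \]

The middle term can be transformed as follows (the integrated term
vanishes because $h(a) = h(b) = 0$):

\[
\int_a^b f_{yy'}\, h h'\, dx = \frac{1}{2} \int_a^b f_{yy'}\, (h^2)'\, dx
  = -\frac{1}{2} \int_a^b \left( \frac{d}{dx} f_{yy'} \right) h^2\, dx,
\]

and hence

\[ \delta^2 F = \int_a^b [ P(x)\, h^2 + f_{y'y'}\, h'^2 ]\, dx, \tag{3} \]

where $P(x) = f_{yy} - \dfrac{d}{dx} f_{yy'}$.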


We claim that for minimal points $y = y_0(x)$ the inequality

\[ f_{y'y'}(x, y_0(x), y_0'(x)) \ge 0 \]

holds for any $x$ in the interval $[a, b]$.

For suppose that at some point $x_0$, $a \le x_0 \le b$, the expression
$f_{y'y'}(x_0, y_0(x_0), y_0'(x_0))$ is negative. Then it is negative in some
neighbourhood $U$ of $x_0$. Let $h_0(x - x_0) \in D_1(a, b)$ be a function that
takes on values between 0 and 1 on the neighbourhood $U$, equals
1 at the point $x_0$, and vanishes outside $U$. We can always find an
open interval of length, say, $\delta$, on which $|h_0'(x - x_0)| \ge c > 0$. We
consider the expression (3) for the second variation under a
displacement $h_m(x) = h_0[m(x - x_0)]$, when $m \to \infty$. The first term
is bounded in absolute value by the quantity

\[ \int_a^b |P(x)|\, dx, \]

the second evidently tends to $-\infty$, since by hypothesis $f_{y'y'} < 0$
on the neighbourhood $U$ and the square of the derivative of the
function $h_m(x)$ is at least $m^2 c^2$ on an open interval of length
$\delta/m$ in this neighbourhood. Hence for sufficiently large $m$,
$\delta^2 F$ takes on negative values under the displacement $h_m(x)$, but
then the functional $F$ cannot have a minimum at the point $y_0$.

The inequality $f_{y'y'}(x, y_0(x), y_0'(x)) \ge 0$ is therefore a necessary
condition for the functional $F$ to have a minimum at the stationary
point $y_0(x)$. This condition is known as Legendre’s condition.
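The scaling argument behind Legendre’s condition can be seen in a small numerical experiment. Everything specific below is an assumption made for illustration: the constant values $P = 10$ and $f_{y'y'} = -1$, the interval $[-1, 1]$ with $x_0 = 0$, the profile $h_0(s) = \cos^2(\pi s / 2)$ for $|s| < 1$ (zero otherwise, so that $h_0$ has a continuous derivative), and the Simpson helper `integrate`. The code evaluates expression (3) for the compressed displacements $h_m(x) = h_0(mx)$.

```python
import math


def integrate(g, a, b, n=4000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3


def h0(s):
    """Assumed profile: cos^2(pi s / 2) on (-1, 1), zero outside."""
    return math.cos(0.5 * math.pi * s) ** 2 if abs(s) < 1.0 else 0.0


def dh0(s):
    """Derivative of h0: -(pi / 2) sin(pi s) on (-1, 1), zero outside."""
    return -0.5 * math.pi * math.sin(math.pi * s) if abs(s) < 1.0 else 0.0


P = 10.0      # assumed constant coefficient of h^2 in (3)
F_PP = -1.0   # assumed constant negative value of f_{y'y'}


def second_variation(m):
    """Expression (3) for the displacement h_m(x) = h0(m x) on [-1, 1]."""
    h = lambda x: h0(m * x)
    dh = lambda x: m * dh0(m * x)
    return integrate(lambda x: P * h(x) ** 2 + F_PP * dh(x) ** 2, -1.0, 1.0)


values = {m: second_variation(m) for m in (1, 2, 4, 8)}
print(values)
```

With these choices the first term decays like $1/m$ while the second grows in magnitude like $m$, so the sum is positive for $m = 1$ and negative from $m = 2$ onwards.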

The determination of convenient sufficiency conditions for a 
minimum is considerably more complex. 

We cite here, without proof, Weierstrass’ sufficiency condition.

Suppose that an extremal $y = y(x)$ can be included in a “field
of extremals”, i.e. in a single-parameter family of extremals
$y = y(x, \alpha)$, where $-\varepsilon < \alpha < \varepsilon$ and $y(x, 0) = y(x)$;
the function $y(x, \alpha)$ is in addition differentiable with respect to
$\alpha$, $\partial y / \partial \alpha > 0$, and for distinct values of $\alpha$ the
curves $y = y(x, \alpha)$ on the interval $a \le x \le b$ do not intersect.
Then, if for all $x$, $y$ in the region covered by the extremals
$y = y(x, \alpha)$ the inequality

\[ f_{y'y'}(x, y, \tau) > 0 \]

is satisfied for all $\tau$, the extremal $y = y(x)$ determines a relative
minimum of the functional $F$ on the space $D_1$; moreover, of all the
curves $y = \tilde{y}(x) \in D_1$ that for sufficiently small $\beta$ satisfy the
inequality

\[ |\tilde{y}(x) - y(x)| < \beta, \]

the functional $F$ takes a minimum value on the curve $y = y(x)$
whatever the derivatives $\tilde{y}'(x)$ may be. On the other hand, if, with the
given conditions, the inequality

\[ f_{y'y'}(x, y, \tau) < 0 \]

is satisfied, then the extremal $y = y(x)$ determines a relative maximum
with the same properties.†

We give some concrete problems of geometry and mechanics 
that reduce to problems of determining the extrema of functionals 
of the given form. 

(a) The functional 


\[ F(y) = \int_a^b \sqrt{1 + y'^2}\, dx \]

expresses the arc-length of the curve $y = y(x)$ for $a \le x \le b$.
The extremum problem for this functional can be formulated as
follows: of all the curves $y = y(x)$ that join prescribed points
$(a, y(a))$, $(b, y(b))$ the curve of least length is to be found. In the given
instance the function $f(x, y, y') = \sqrt{1 + y'^2}$ is independent of $y$,
and so Euler’s equation (2) assumes the form

\[ \frac{d}{dx} f_{y'} = 0, \]

from which it follows that

\[ f_{y'} = \frac{y'}{\sqrt{1 + y'^2}} = \text{const}; \]

but then $y' = \text{const}$ also, and hence the solution is given by a
linear function $y = C_1 x + C_2$; the required curve is the straight
line joining the given points.

It is clear that we are dealing here with a minimum; we now
consider the form assumed by the Weierstrass condition. The
extremal $y = C_1 x + C_2$ can evidently be included in the field

\[ y = C_1 x + C_2 + \alpha, \qquad -\varepsilon < \alpha < \varepsilon. \]

Further,

\[ f_{y'y'} = \frac{1}{(1 + y'^2)^{3/2}}, \]

and hence

\[ f_{y'y'}(x, y, \tau) = \frac{1}{(1 + \tau^2)^{3/2}} > 0, \]

so that the Weierstrass condition is satisfied.
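The minimum property of the straight line is easy to observe numerically. In the sketch below the end-points $(0, 0)$ and $(1, 1)$, the family of comparison curves $y = x + t \sin \pi x$, and the Simpson helper `integrate` are all assumptions introduced for illustration; the code compares the arc-length of the line with that of several displaced curves sharing the same end-points.

```python
import math


def integrate(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    step = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3


def arc_length(dy):
    """Arc-length over [0, 1] of a curve whose derivative is dy."""
    return integrate(lambda x: math.sqrt(1.0 + dy(x) ** 2), 0.0, 1.0)


# The extremal y = x joining (0, 0) and (1, 1).
line_length = arc_length(lambda x: 1.0)


def displaced_length(t):
    """Arc-length of y = x + t sin(pi x), which has the same end-points."""
    return arc_length(lambda x: 1.0 + t * math.pi * math.cos(math.pi * x))


displaced = [displaced_length(t) for t in (-0.2, -0.05, 0.05, 0.2)]
print(line_length, displaced)
```

Every displaced curve in this family is longer than the line, in agreement with the sign of $f_{y'y'}$ found above.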


† The proof can be found, for example, in M. A. Lavrent’ev and L. A. Lyusternik,
A course of variational calculus, M.-L., 1950, Chapter 8.




(b) We pose the analogous problem for a surface given by the parametric
equations

\[ x = x(u, v); \qquad y = y(u, v); \qquad z = z(u, v). \tag{4} \]

It is well known that the arc-length of a curve $v = v(u)$ joining
points $A$, $B$ on the surface (4) can be expressed by the integral

\[
F(v) = \int \sqrt{ E(u, v) + 2 F(u, v)\, \frac{dv}{du}
  + G(u, v) \left( \frac{dv}{du} \right)^2 }\ du, \tag{5}
\]

where $E$, $F$, $G$ are the Gaussian coefficients for an element of arc.
Putting the general case aside for the moment, we consider the
case of a sphere given by the equation

\[ x = \cos\varphi \cos\psi, \qquad y = \sin\varphi \cos\psi, \qquad z = \sin\psi \tag{6} \]

in spherical coordinates ($u = \psi$, $v = \varphi$). The Gaussian parameters
take the form

\[
\begin{aligned}
G &= x_v x_v + y_v y_v + z_v z_v = \cos^2 \psi, \\
F &= x_u x_v + y_u y_v + z_u z_v = 0, \\
E &= x_u x_u + y_u y_u + z_u z_u = 1,
\end{aligned}
\]
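and the functional (5) takes the form

\[ F(\varphi) = \int \sqrt{1 + \cos^2\psi\ \varphi'^2}\ d\psi. \]

Writing $f = \sqrt{1 + \cos^2\psi\ \varphi'^2}$ for the integrand, the Euler
equation

\[ f_{\varphi} - \frac{d}{d\psi} f_{\varphi'} = 0, \]

just as above, admits a first integral

\[ f_{\varphi'} = \frac{\cos^2\psi\ \varphi'}{\sqrt{1 + \cos^2\psi\ \varphi'^2}} = C, \]

which can also be written in the form

\[ \frac{d\varphi}{d\psi} = \frac{C}{\cos\psi \sqrt{\cos^2\psi - C^2}}. \]

The general integral of this equation is obtained by making the
substitution $\tan\psi = t$:

\[ \sin(\varphi + C_1) = C_2 \tan\psi \qquad \left( C_2 = \frac{C}{\sqrt{1 - C^2}} \right) \]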

or

\[ \sin\psi = \alpha \sin\varphi \cos\psi + \beta \cos\varphi \cos\psi, \]

where $\alpha$, $\beta$ are new constants. Reverting to rectangular coordinates
in accordance with (6), we get

\[ z = \alpha y + \beta x. \]

We have obtained the equation of a plane passing through the
origin of coordinates; the curve in which it intersects the sphere
is an arc of a great circle. Thus the lines of least length on the
sphere, if they exist, are arcs of great circles.

We verify that the Weierstrass condition is satisfied. Provided
that the two points selected are not diametrically opposite (so
that $\cos\psi \ne 0$), the arc of the great circle joining them can
evidently be included in a field of extremals. Further,

\[ f_{\varphi'\varphi'} = \frac{\cos^2\psi}{[1 + \varphi'^2 \cos^2\psi]^{3/2}}, \]

so that

\[ f_{\varphi'\varphi'}(\psi, \varphi, \tau) = \frac{\cos^2\psi}{[1 + \tau^2 \cos^2\psi]^{3/2}} > 0, \]

and the Weierstrass condition is satisfied; thus the arc $\Gamma$ of the
great circle that joins the two given points really does determine
a minimum length in respect of all the curves that join these
points and pass sufficiently close to the curve $\Gamma$.
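The chain from the first integral to the plane through the origin can be checked numerically. The sketch below makes several assumptions for illustration only: the sample constants $C = 0.5$ and $C_1 = 0.3$, the sample latitudes, and a central-difference approximation to $\varphi'$. It takes $\varphi(\psi)$ from $\sin(\varphi + C_1) = C_2 \tan\psi$ with $C_2 = C/\sqrt{1 - C^2}$, evaluates the first integral along the curve, and tests the plane equation with $\alpha = \cos C_1 / C_2$, $\beta = \sin C_1 / C_2$.

```python
import math

C = 0.5                           # assumed value of the first integral
C1 = 0.3                          # assumed integration constant
C2 = C / math.sqrt(1.0 - C * C)


def phi(psi):
    """Longitude along the extremal, from sin(phi + C1) = C2 * tan(psi)."""
    return math.asin(C2 * math.tan(psi)) - C1


def first_integral(psi, step=1e-6):
    """cos^2(psi) phi' / sqrt(1 + cos^2(psi) phi'^2), with phi' by central differences."""
    dphi = (phi(psi + step) - phi(psi - step)) / (2.0 * step)
    c2 = math.cos(psi) ** 2
    return c2 * dphi / math.sqrt(1.0 + c2 * dphi ** 2)


def plane_residual(psi):
    """z - (alpha * y + beta * x) at the point of the sphere with latitude psi."""
    alpha, beta = math.cos(C1) / C2, math.sin(C1) / C2
    p = phi(psi)
    x = math.cos(p) * math.cos(psi)
    y = math.sin(p) * math.cos(psi)
    z = math.sin(psi)
    return z - (alpha * y + beta * x)


latitudes = (-0.6, -0.2, 0.0, 0.3, 0.7)   # must satisfy |C2 * tan(psi)| < 1
integrals = [first_integral(psi) for psi in latitudes]
residuals = [plane_residual(psi) for psi in latitudes]
print(integrals, residuals)
```

The first list should be constant and equal to the chosen $C$ up to discretisation error, and the second should vanish up to rounding, which is the statement that the extremal lies in a plane through the centre of the sphere.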

Before proceeding to the two remaining examples, we shall
make a practical observation concerning the integration of Euler’s
equation when the function $f = f(x, y, y')$ is independent of the
argument $x$, so that

\[ f = f(y, y'). \]

We shall regard $y$ here as the independent variable, and $x$ as a
function of $y$ to be determined. Then the functional $F$ reduces
to the form

\[ F = \int_{y_1}^{y_2} f\!\left( y, \frac{1}{x'} \right) x'\, dy. \]

Now the Euler equation

\[ g_x - \frac{d}{dy}\, g_{x'} = 0, \qquad \text{where } g = f\!\left( y, \frac{1}{x'} \right) x', \]

will have a first integral

\[ g_{x'} = \text{const}, \]

just as above, or what is the same thing,

\[ f_{y'}(y, y') \left( -\frac{1}{x'^2} \right) x' + f(y, y') = f - y' f_{y'}(y, y') = C. \tag{7} \]

It remains to integrate the first-order equation obtained, and this
can be done by quadratures since $x$ is absent.

We now consider the following examples: 

(c) An extremum of the functional

$$F(y) = 2\pi \int_a^b y\sqrt{1 + y'^2}\,dx$$

provides the solution to the following problem: to find the curve
y = y(x) joining prescribed points (a, y(a)), (b, y(b)) for which the
area of the corresponding surface of revolution about the x-axis
is least.

To solve Euler’s equation, we apply the foregoing procedure. In
this case the first integral (7) takes the form

$$y\sqrt{1 + y'^2} - \frac{y\,y'^2}{\sqrt{1 + y'^2}} = C$$

or, what is the same thing,

$$\frac{y}{\sqrt{1 + y'^2}} = C.$$

With the substitution y = C cosh t, the equation is easily integrated to

$$y = C\cosh\frac{x + C_1}{C}.$$
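As a quick numerical sanity check (a sketch only, not part of the original text), the following Python fragment evaluates the quantity f − y' f_{y'} for f = y√(1 + y'²) along a catenary y = C cosh((x + C_1)/C). The constants C = 1.5, C_1 = 0.3 and the sample abscissae are arbitrary assumptions chosen for illustration; the factor 2π is dropped.

```python
import math

def first_integral(y, dy):
    """f - y' * f_{y'} for f(y, y') = y * sqrt(1 + y'^2)."""
    root = math.sqrt(1.0 + dy * dy)
    return y * root - y * dy * dy / root

def catenary(x, C, C1):
    """Return (y, y') on the catenary y = C * cosh((x + C1) / C)."""
    t = (x + C1) / C
    return C * math.cosh(t), math.sinh(t)

C, C1 = 1.5, 0.3  # arbitrary illustrative constants
values = [first_integral(*catenary(x, C, C1)) for x in (-1.0, 0.0, 0.5, 2.0)]
print(values)  # every entry should agree with C up to rounding
```

The printed values coincide with C at every sample point, which is what the first integral asserts.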


The curve y = y(x) will be the one required if it belongs to this
two-parameter family and passes through the given points (a, y(a)),
(b, y(b)). For the sake of simplicity we shall suppose that a = −b,
y(a) = y(−a). Then $C_1 = 0$ and the whole family of extremals
reduces to the single-parameter family

$$y = C\cosh\frac{x}{C}, \tag{8}$$

which is obtained from the catenary y = cosh x as the family of
all its possible dilations (similarity transformations) with centre at
the origin of coordinates.
Depending on the position of the points (a, y(a)) and (−a, y(−a))
in the plane, three distinct possibilities can arise: the number of
catenaries in the family (8) that pass through the points may be
two, one, or none at all (Fig. 5, the pairs of points $A_1$, $B_1$; $A_2$, $B_2$;
$A_3$, $B_3$ respectively). We shall consider the situation in respect of
Weierstrass’ sufficiency condition. We have

$$f_{y'y'} = \frac{2\pi y}{(1 + y'^2)^{3/2}};$$


this quantity is positive for y > 0 and any $y' = \tau$. The upper of
the two possible extremals that join the points $A_1$ and $B_1$ can always
be included in a field by employing extremals of the family (8).
The Weierstrass condition is thus satisfied, and the upper extremal
therefore determines a relative minimum of the functional F. The
lower extremal cannot be so included, and the Weierstrass condition
is not satisfied. We must therefore leave open the question
as to the nature of the extremum determined by it. A more precise
investigation will show that the lower extremal yields neither a
maximum nor a minimum.

In the case of a single extremal joining the points $A_2$, $B_2$, it
can also be included in a field and so determines a relative minimum.


Fig. 5        Fig. 6


In the third case there is no member of the class of twice differentiable
curves joining the points $A_3$, $B_3$ at which the functional F
attains a relative minimum.
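The three cases can be told apart numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it counts the members of the family (8) through the symmetric points (−a, h), (a, h). Writing t = a/C turns the condition C cosh(a/C) = h into cosh(t)/t = h/a, and cosh(t)/t has a single minimum where cosh t = t sinh t. The function names and tolerances are hypothetical choices.

```python
import math

def critical_ratio(tol=1e-14):
    """Smallest value of cosh(t)/t for t > 0, located via cosh t = t sinh t."""
    lo, hi = 1.0, 2.0  # cosh t - t sinh t changes sign on [1, 2]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.cosh(mid) - mid * math.sinh(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    return math.cosh(t) / t

def count_catenaries(a, h, rel_tol=1e-12):
    """Number of curves y = C cosh(x/C), C > 0, through (-a, h) and (a, h)."""
    threshold = a * critical_ratio()  # least height the family reaches at x = a
    if math.isclose(h, threshold, rel_tol=rel_tol):
        return 1
    return 2 if h > threshold else 0

print(critical_ratio())            # roughly 1.5089
print(count_catenaries(1.0, 2.0))  # two catenaries
print(count_catenaries(1.0, 1.0))  # none
```

In this sketch the end-points admit an extremal only when their height is at least about 1.509 times their half-distance; below that ratio the third case of the text occurs.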

(d) In 1696 J. Bernoulli proposed and solved the following problem,
which became an important landmark in the development
of the calculus of variations: What is the curve that must join two
points A, B in a vertical plane in order that a point mass constrained 
to slide along it under the force of gravity should pass from A to B 
in the least time? 

The required curve was termed by him the brachistochrone. 

To commence the solution of the problem, we first find the time
taken for the point M of mass m to slide along the prescribed
curve from the first point to the second under the action of gravity
and starting from rest. We choose the origin of coordinates at the
first of the prescribed points and fix axes as shown in Fig. 6.

We show first of all that at a point with coordinates x, y the
velocity of M is $v = \sqrt{2gy}$. To do this we resolve the gravitational
force P = mg into its normal and tangential components; the
former plays no part in the motion, but the tangential component
produces a tangential acceleration equal to g dy/ds. We have

$$\frac{dv}{dt} = g\,\frac{dy}{ds}, \qquad v = \frac{ds}{dt};$$

dividing these equations one by the other, we eliminate ds and
dt and arrive at the equation

$$v\,dv = g\,dy.$$

Integrating this equation and taking into account the initial condition
y = 0, v = 0, we get

$$v = \sqrt{2gy},$$

as required. We have further

$$dt = \frac{ds}{v} = \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}}\,dx,$$

and for the required duration, we obtain the expression

$$F(y) = \int_0^b \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}}\,dx.$$


It is evident that F(y) = F(y(x)) is a functional depending on the
choice of the function y(x). In this instance the function f is no
longer twice differentiable with respect to y; nevertheless, a more
precise investigation will disclose that here also F is differentiable,
and its variation can be calculated in accordance with formula (1).
Just as in the preceding example Euler’s equation has a first
integral

$$f - y' f_{y'} = \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}} - \frac{y'^2}{\sqrt{2gy}\,\sqrt{1 + y'^2}} = C$$

or, what is the same thing,

$$y = \frac{C_1}{1 + y'^2} \qquad \left(C_1 = \frac{1}{2gC^2}\right).$$

It is convenient here to go over to the parametric representation

$$y' = \tan\varphi.$$

As a result, we get

$$y = C_1\cos^2\varphi, \qquad y' = -C_1\sin 2\varphi\cdot\varphi' = \tan\varphi,$$

so that

$$\varphi' = \frac{d\varphi}{dx} = -\frac{1}{2C_1\cos^2\varphi},$$

and hence

$$x = -\frac{C_1}{2}\,(2\varphi + \sin 2\varphi) + C_2.$$

Replacing $2\varphi$ by $\pi - \theta$, we get a simpler parametric form of the
solution:

$$x = a(\theta - \sin\theta) + b, \qquad y = a(1 - \cos\theta),$$

where a, b are new constants.

Thus the extremals form a family of cycloids with cusps on the
x-axis. A unique curve of the family is determined by the conditions
of the problem y(0) = 0, y(b) = C, and this is the curve required.
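The following Python sketch compares, under illustrative assumptions, the descent time along a cycloid arc with the descent time along the straight chord joining the same end-points. The value of g, the cycloid parameter a = 1 and the choice of the end-point at θ = π are assumptions made for the example; y is measured downwards, as in Fig. 6.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (illustrative value)

def cycloid_time(a, theta_end, n=10000):
    """Descent time along x = a(theta - sin theta), y = a(1 - cos theta)."""
    dtheta = theta_end / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta  # midpoint rule avoids the cusp at theta = 0
        ds = a * math.sqrt(2.0 - 2.0 * math.cos(theta)) * dtheta
        v = math.sqrt(2.0 * G * a * (1.0 - math.cos(theta)))  # v = sqrt(2 g y)
        total += ds / v
    return total

def straight_line_time(x_end, y_end):
    """Descent time from rest along the chord from (0, 0) to (x_end, y_end)."""
    length = math.hypot(x_end, y_end)
    acceleration = G * y_end / length  # component of gravity along the chord
    return math.sqrt(2.0 * length / acceleration)

a = 1.0
t_cycloid = cycloid_time(a, math.pi)
t_line = straight_line_time(math.pi * a, 2.0 * a)
print(t_cycloid, t_line)
```

Along the cycloid the integrand ds/v reduces to the constant √(a/g) dθ, so the quadrature reproduces π√(a/g); the chord takes noticeably longer.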

Note 1. In the problems considered we have found curves
$y = y_0(x)$ which determine a relative minimum of a functional
F(y). This is not quite an answer to the problem proposed: to
find the curve $y = y_0(x)$ that determines the absolute minimum
of the functional F(y) (i.e. a minimum in respect not only of sufficiently
close curves, but of all functions y = y(x) for which F(y)
is defined). Of course the required minimum will also be a relative
minimum if it is attained for some curve $y = y_0(x)$, and will then
be yielded by our methods. But it may be that no smooth curve
determines the absolute minimum. This is known to be the case,
for example, in the problem of a minimal surface of revolution, if
the prescribed points through which the generating curve must
pass are sufficiently far apart. Moreover our results by no means
exclude the possibility that, even in the “good” case when the
prescribed points are close together, the extremal $y = y_0(x)$ joining
them will determine a relative and not an absolute minimum (i.e.
although the inequality $F(y) \geq F(y_0)$ holds for curves close to
$y = y_0(x)$, it may be that there are other curves for which
$F(y) < F(y_0)$ and that there is no smooth curve for which F(y)
attains an absolute minimum). In fact this is not the case. There
is a general theorem, called the Hilbert-Tonelli theorem, that
guarantees the existence (in the class of rectifiable curves) of a
solution to the extremal problem. However we cannot dwell here
on the proof†.

† Cf. N. I. Akhiezer, Lectures in the Calculus of Variations, State Technical
Publishing House, 1955, Chapter IV, Sections 33-36. English translation:
The Calculus of Variations (Blaisdell, New York, 1962).

Note 2. In the course of the discussion we assumed that the required
solution y = y(x) has a continuous second derivative. We
can avoid making this assumption if we adopt a slightly different
approach, which we now describe.

We recall that the variation of the functional

$$F(y) = \int_a^b f(x, y, y')\,dx$$

has the form

$$\delta F(y, h) = \int_a^b [f_y\,h(x) + f_{y'}\,h'(x)]\,dx. \tag{9}$$

We transformed this expression, integrating the second term by
parts in order to eliminate h'(x) and deal only with h(x). But we
can adopt a different procedure, integrating the first term by parts
to eliminate h(x) and leaving only h'(x) to deal with. It turns out
that with this method, not only is it unnecessary to assume the
existence of y'', but we can even prove its existence. Thus, integrating
the first term in (9) by parts, we find

$$\int_a^b \frac{\partial f(x, y, y')}{\partial y}\,h(x)\,dx = g(x)\,h(x)\Big|_a^b - \int_a^b g(x)\,h'(x)\,dx,$$

where g(x) denotes the primitive of the function $\partial f(x, y, y')/\partial y$
(we remember that y is a function of x and hence so also is
$\partial f(x, y, y')/\partial y$). Since h(a) = h(b) = 0 the term without an integral
sign vanishes, and we get

$$\delta F(y, h) = \int_a^b \left[\frac{\partial f}{\partial y'} - g(x)\right] h'(x)\,dx = 0.$$

It will be shown below that this equation holds for all permissible
h(x) only if

$$\frac{\partial f}{\partial y'} - g(x) = \text{const}. \tag{10}$$


The function $\partial f(x, y, y')/\partial y'$, as a function of the variable x, is not
in general differentiable. But in the given instance, equation (10)
shows that it is differentiable together with g(x).
Differentiating the left-hand side with respect to x, we arrive at
Euler’s equation:

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\,\frac{\partial f}{\partial y'} = 0.$$

A total derivative for the second term cannot be exhibited until
the existence of y'' is proved. We show that y'' exists wherever
$f_{y'y'}(x, y, y')$ is non-zero.

The derivative $\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)$ is the limit as $\Delta x \to 0$ of the expression

$$\frac{f_{y'}(x + \Delta x,\, y(x + \Delta x),\, y'(x + \Delta x)) - f_{y'}(x, y, y')}{\Delta x}
= \overline{\frac{\partial f_{y'}}{\partial x}} + \overline{\frac{\partial f_{y'}}{\partial y}}\,\frac{\Delta y}{\Delta x} + \overline{\frac{\partial f_{y'}}{\partial y'}}\,\frac{\Delta y'}{\Delta x},$$

where a bar denotes that the corresponding expression is evaluated
for some intermediate values of its arguments. The bar is dispensed
with as $\Delta x \to 0$ and the corresponding expression is considered
at the original values of the arguments. In addition, $\Delta y/\Delta x$ tends
to y'(x). Since $\partial f_{y'}/\partial y'$ is stipulated to be non-zero, the expression
$\Delta y'/\Delta x$ has a limit, and this signifies the existence of the second
derivative.

It remains for us to prove the following lemma of Du Bois-Reymond:

If for some continuous function A(x)

$$\int_a^b A(x)\,h'(x)\,dx = 0 \tag{11}$$

for any function $h(x) \in D_1(a, b)$ that vanishes at x = a and x = b,
then the function A(x) is constant.

For the proof, we suppose that the function A(x) is not constant
and that there exist points $x_1$, $x_2$, say, for which $A(x_1) < A(x_2)$. We
show that there exists a function $h(x) \in D_1(a, b)$, h(a) = h(b) = 0,
for which equation (11) fails to hold. We take an arbitrary number C
with value between $A(x_1)$ and $A(x_2)$. Since the function A(x) is
continuous, we can find disjoint open intervals $\Delta_1 \ni x_1$, $\Delta_2 \ni x_2$
such that for any $x' \in \Delta_1$, $x'' \in \Delta_2$

$$A(x') < C < A(x'').$$

As h'(x) we take any continuous function that is positive on $\Delta_1$,
negative on $\Delta_2$, and zero outside $\Delta_1 + \Delta_2$, and is such that

$$\int_a^b h'(x)\,dx = \int_{\Delta_1} h'(x)\,dx + \int_{\Delta_2} h'(x)\,dx = 0.$$

The function h(x) is then defined naturally by the formula

$$h(x) = \int_a^x h'(\xi)\,d\xi;$$

evidently $h(x) \in D_1(a, b)$ and h(a) = h(b) = 0.
We have further

$$\int_a^b [A(x) - C]\,h'(x)\,dx = \int_{\Delta_1} [A(x) - C]\,h'(x)\,dx + \int_{\Delta_2} [A(x) - C]\,h'(x)\,dx < 0,$$

since both terms are negative. But then

$$\int_a^b A(x)\,h'(x)\,dx = \int_a^b [A(x) - C]\,h'(x)\,dx + C\int_a^b h'(x)\,dx = \int_a^b [A(x) - C]\,h'(x)\,dx < 0;$$

we see that for the given h(x) equation (11) does not hold, as required.
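The construction in the proof can be tried out numerically. In the sketch below every concrete choice is an assumption made for illustration: A(x) = x on [0, 1], the number C = 0.5, the intervals Δ_1 = (0.1, 0.3) and Δ_2 = (0.7, 0.9), and sin² bumps for h'.

```python
import math

def bump(x, left, right):
    """Continuous sin^2 bump supported on (left, right), zero elsewhere."""
    if left < x < right:
        return math.sin(math.pi * (x - left) / (right - left)) ** 2
    return 0.0

def A(x):
    return x  # a non-constant continuous function on [0, 1]

def h_prime(x):
    # positive on Delta_1 (where A < 0.5), negative on Delta_2 (where A > 0.5)
    return bump(x, 0.1, 0.3) - bump(x, 0.7, 0.9)

def integrate(func, a, b, n=20000):
    """Composite midpoint rule."""
    step = (b - a) / n
    return step * sum(func(a + (i + 0.5) * step) for i in range(n))

total_h_prime = integrate(h_prime, 0.0, 1.0)                # vanishes, so h(1) = 0
pairing = integrate(lambda x: A(x) * h_prime(x), 0.0, 1.0)  # strictly negative
print(total_h_prime, pairing)
```

The integral of h' vanishes, so h(0) = h(1) = 0, while the integral of A h' comes out negative, so equation (11) fails for this h exactly as the proof predicts.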


Problems. 1. Find the extremals and investigate the conditions for solubility
of the extremal problem for the following functionals:

(a) $\int_{-1}^{+1} \sqrt{y\,(1 + y'^2)}\,dx$, y(−1) = y(1) = b > 0;

(b) $\int_a^b \frac{1 + y^2}{y'^2}\,dx$, y(a) = A, y(b) = B.

Answer. (a) One solution for b = 1, two for b > 1 (parabolas), and none
for b < 1.
(b) Always one solution of the form $y = \sinh(C_1 x + C_2)$.
2. Analyse the extremal problem for the functionals:

(a) $\int_0^1 y'\,dx$, y(0) = 0, y(1) = 1;

(b) $\int_0^1 y\,y'\,dx$, y(0) = 0, y(1) = 1;

(c) $\int_0^1 x\,y\,y'\,dx$, y(0) = 0, y(1) = 1.

Answer. In cases (a) and (b) the value of the functional is independent of
the choice of the function y(x). In case (c) the variation of the functional
does not vanish for any curve joining the given points; there is no extremum.

3. According to Fermat’s principle, light travels in such a way that it 
traverses the distance between points A, B, in the least possible time. 
Assuming that the velocity of light in the earth’s atmosphere varies linearly 
with altitude, find the form of a light ray. The curvature of the earth’s 
surface may be ignored. 


Answer. A circular arc. 
4. Find an extremal of the functional

$$F(y) = \int_0^1 e^{y'} \tan y'\,dx, \qquad y(0) = 0, \quad y(1) = 1.$$

Answer. y = x.
5. Prove the following generalisation of Du Bois-Reymond’s lemma:
If A(x) is a continuous function and

$$\int_a^b A(x)\,h^{(n)}(x)\,dx = 0$$

for any function $h(x) \in D_n(a, b)$ that vanishes at x = a and at x = b together
with its derivatives up to order n − 1, then A(x) is a polynomial of degree
less than n.

Hint. Put

$$h^{(n)}(x) = p(x), \qquad h(x) = \int_a^x d\xi_n \int_a^{\xi_n} d\xi_{n-1} \cdots \int_a^{\xi_2} p(\xi_1)\,d\xi_1 = \frac{1}{(n-1)!}\int_a^x p(\xi)\,(x - \xi)^{n-1}\,d\xi$$

(Dirichlet’s formula). The hypothesis of the theorem can now be expressed
as follows: the integral

$$\int_a^b A(x)\,p(x)\,dx$$

vanishes for any function p(x) for which

$$\int_a^b x^k\,p(x)\,dx = 0, \qquad k = 0, 1, \ldots, n - 1.$$

Then apply the result of problem 3, Section 9, Chapter II.


4. FUNCTIONALS OF THE TYPE $\int_a^b f(x, y, y')\,dx$ (continued)


1. Conditional extrema 


Apart from boundary conditions prescribing y(a) and y(b), problems in the
calculus of variations sometimes involve additional conditions of
the form

$$G(y) = \int_a^b g(x, y, y')\,dx = C \quad \text{(where C is a given constant).}$$

Such a problem is that of Dido, which seeks the curve y = y(x),
y(a) = 0, y(b) = 0, that for a prescribed length L > b − a bounds
the greatest area. Here the subject of investigation is the extremum
of the functional

$$F(y) = \int_a^b y\,dx$$

under the restrictions y(a) = y(b) = 0 and the supplementary
condition

$$G(y) = \int_a^b \sqrt{1 + y'^2}\,dx = L.$$

The problem can be formulated abstractly as follows: find an
extremum of the differentiable functional F(y) at a curve of the
manifold determined by the equation

$$G(y) = C,$$

where G(y) is some other differentiable functional.

For the solution of this problem we make the additional assumption
that the required point is not a stationary point of the functional
G. In general, therefore, a separate study must be made of
the stationary points of G. Fundamental to the solution of the
problem is the fact that, while the variation of the functional F
must vanish as before at the required extremal point, it no longer
has to do so for all possible displacements h, but only for those h
that leave the value of the functional G unchanged. More precisely,
we make the following assertion: at the required extremal point,
any vector h that satisfies the equation

$$\delta G(y, h) = 0$$

must also satisfy the equation

$$\delta F(y, h) = 0.$$

Let us suppose, on the contrary, that for some $h = h_0$ we have

$$\delta G(y, h_0) = 0, \qquad \delta F(y, h_0) = A \neq 0.$$

Then for any t, |t| < 1, it will be the case that

$$\delta G(y, t h_0) = 0, \qquad \delta F(y, t h_0) = tA \neq 0 \quad \text{for } t \neq 0.$$

In general the displacement $t h_0$ will lead us off the surface G(y) = C,
and we shall have $G(y + t h_0) \neq C$. Since y is not a stationary point
of G, there is a vector $h_1$ with $\delta G(y, h_1) \neq 0$, and with the aid of a
suitable number s we modify the displacement $t h_0$ so that the new
displacement $t h_0 + s h_1$ satisfies the equation

$$G(y + t h_0 + s h_1) = C.$$

It can be shown that for all sufficiently small t the number s
exists and is an infinitesimal of higher order than t, so that $s/t \to 0$
as $t \to 0$.

For let us write

$$G(y + t h_0 + s h_1) = G(y).$$

Since G is a differentiable functional, we have

$$G(y + t h_0 + s h_1) = G(y) + \delta G(t h_0 + s h_1) + r(t h_0 + s h_1),$$

where r(h) is an infinitesimal of higher order than |h|. And since
$\delta G(h_0) = 0$, $\delta G(h_1) = b \neq 0$, we get

$$K(s, t) \equiv s b + r(t h_0 + s h_1) = 0. \tag{1}$$

The function $r(t h_0 + s h_1)$ is differentiable with respect to t and s
together with $G(y + t h_0 + s h_1)$, and so, in accordance with the
hypotheses,

$$\left.\frac{\partial r(t h_0 + s h_1)}{\partial t}\right|_{s=0,\,t=0} = \left.\frac{\partial r(t h_0 + s h_1)}{\partial s}\right|_{s=0,\,t=0} = 0.$$

We see that the function K(s, t) satisfies the conditions of the
implicit function theorem; from equation (1), s can be expressed as a
well-defined function of t, vanishing at t = 0. This function of t is
differentiable, and so

$$\left.\frac{ds}{dt}\right|_{t=0} = -\left.\frac{\partial K/\partial t}{\partial K/\partial s}\right|_{s=0,\,t=0} = 0,$$

showing that s is an infinitesimal of higher order than t.
We have further that for the functional F

$$F(y + t h_0 + s h_1) - F(y) = t\,\delta F(y, h_0) + s\,\delta F(y, h_1) + \cdots = tA + \cdots,$$

where dots stand for infinitesimals of higher order than t. This
expression obviously has differing signs for positive and negative t
(sufficiently small in absolute value); hence when G maintains the
constant value C, the functional F cannot have an extremum at
the point y.

Our assertion is therefore proved. We now deduce from it a rule 
for determining the required extremal point. 

For this we employ a simple lemma from the general theory of 
linear functionals. 


Lemma. If a linear functional F(h) vanishes for any vector h for
which another linear functional G(h) vanishes, then F(h) is proportional
to G(h):

$$F(h) = \lambda\,G(h) \quad (\lambda \text{ fixed}). \tag{2}$$

Proof. If $G(h) \equiv 0$, then $F(h) \equiv 0$ and equation (2) holds for
any $\lambda$. Let $G(h) \not\equiv 0$ and let $h_0$ be a vector for which $G(h_0) = b \neq 0$.
Then for any h, a number t can be found such that

$$G(h - t h_0) = G(h) - t\,G(h_0) = 0.$$

This equation is obviously satisfied by $t = G(h)/G(h_0)$. By hypothesis

$$F(h - t h_0) = 0,$$

so that

$$F(h) = t\,F(h_0) = G(h)\,\frac{F(h_0)}{G(h_0)} = \lambda\,G(h),$$

where

$$\lambda = \frac{F(h_0)}{G(h_0)},$$

and the lemma is proved.

We have seen that the linear functional $\delta F(y, h)$ vanishes for
any vector h for which the functional $\delta G(y, h)$ vanishes. By the
lemma just proved, we have

$$\delta F = \lambda\,\delta G,$$

or

$$\delta(F - \lambda G) = 0.$$

Hence the required extremal point y is determined as one for
which the functional

$$H = F - \lambda G$$

for some (unknown) $\lambda$ has a stationary value (on the whole space).
If we have a method for finding y from this condition, we shall
get a stationary point $y(\lambda)$ for each $\lambda$; but the only relevant $\lambda$
are those for which the corresponding point $y(\lambda)$ satisfies the
equation

$$G(y(\lambda)) = C.$$

The rule obtained is analogous to the well-known principle of 
Lagrange multipliers in the theory of conditional extrema of func- 
tions of several variables. 
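As a hedged illustration of the rule, here is a sketch of how it applies to Dido's problem stated at the beginning of this subsection, with F(y) = ∫ y dx and G(y) = ∫ √(1 + y'²) dx. The derivation is a worked example added for clarity; the auxiliary angle φ and the constants C_1, C_2 are notation introduced here, and the integrand of H = F − λG does not contain x, so the first integral f − y' f_{y'} = const is available.

```latex
\begin{align*}
H(y) &= \int_a^b \Bigl( y - \lambda\sqrt{1 + y'^2} \Bigr)\,dx, \\
f - y' f_{y'} &= y - \lambda\sqrt{1 + y'^2} + \frac{\lambda\,y'^2}{\sqrt{1 + y'^2}}
              = y - \frac{\lambda}{\sqrt{1 + y'^2}} = C_1 . \\
\intertext{Putting $y' = \tan\varphi$ gives}
y - C_1 &= \lambda\cos\varphi, \qquad
dx = \frac{dy}{\tan\varphi} = -\lambda\cos\varphi\,d\varphi, \qquad
x - C_2 = -\lambda\sin\varphi, \\
\intertext{and therefore}
(x - C_2)^2 + (y - C_1)^2 &= \lambda^2 .
\end{align*}
```

The extremals are thus arcs of circles of radius |λ|; the three constants C_1, C_2 and λ would be fixed by the conditions y(a) = y(b) = 0 and G(y) = L.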

Example. Find an extremum of the functional

$$F(y) = \int_a^b y\sqrt{1 + y'^2}\,dx$$

subject to the conditions $y(a) = y_a$, $y(b) = y_b$, and

$$G(y) = \int_a^b \sqrt{1 + y'^2}\,dx = C. \tag{3}$$

Here

$$H = F - \lambda G = \int_a^b (y - \lambda)\sqrt{1 + y'^2}\,dx.$$

Putting $y - \lambda = z$, we get the extremum problem for the functional

$$H(z) = \int_a^b z\sqrt{1 + z'^2}\,dx$$

subject to the conditions $z(a) = y_a - \lambda$, $z(b) = y_b - \lambda$. The solution,
as we know, is an arc of a catenary; the number $\lambda$ can be
determined from condition (3), which fixes the length of this arc.
The functional G(y) has only one stationary point; it is the
straight-line segment joining the points $(a, y_a)$,
$(b, y_b)$, and its length l is the corresponding value of G(y). The
problem evidently becomes meaningless with the condition
G(y) = l.


Problems. 1. Solve Dido’s problem of finding the curve y = y(x), y(b)
= y(a) = 0, of prescribed length L > b − a, that together with the closed
interval a ≤ x ≤ b bounds the greatest area.

Answer. The arc of a circle.


2. Find the closed curve of prescribed length L that encloses the greatest
area.

Hint. Use polar coordinates. Show that Euler’s equation for the functional 
F — AG implies that the required curve has constant curvature. 

Answer. A circle. 


3. Find the solid of revolution of prescribed axial cross-sectional area that
contains the least volume.
Answer. A cylinder.


4. Find the solid of revolution of prescribed lateral surface area that contains
the greatest volume.
Answer. The solid of revolution of a circular segment about a chord.


5. Differentiable functionals $G_1(y), \ldots, G_n(y)$ are said to be independent
at a point $y_0$ if their variations $\delta G_1(y_0, h), \ldots, \delta G_n(y_0, h)$ are linearly
independent. Show that $y \to \{G_1(y), \ldots, G_n(y)\}$ is a mapping of a neighbourhood
of the point $y_0$ on to a neighbourhood of the point $\{G_1(y_0), \ldots, G_n(y_0)\}$
in n-dimensional space if the functionals $G_1, \ldots, G_n$ are independent at $y_0$.

Hint. Find n elements $h_1, \ldots, h_n$ such that $\det \|\delta G_i(y_0, h_j)\| \neq 0$ (problem 4
of Section 9, Chapter II) and apply the implicit function theorem to the system
of equations ($\varepsilon_i$ given, $t_j$ unknown)

$$G_i(y_0 + t_1 h_1 + \cdots + t_n h_n) = G_i(y_0) + \varepsilon_i \qquad (i = 1, 2, \ldots, n).$$

6. Show that the extremum problem of the functional

$$F(y) = \int_a^b f(x, y, y')\,dx$$

with n supplementary conditions

$$G_1(y) = \int_a^b g_1(x, y, y')\,dx = C_1, \quad \ldots, \quad G_n(y) = \int_a^b g_n(x, y, y')\,dx = C_n$$

reduces, under the hypothesis that the $G_i(y)$ are independent, to
the extremum problem of the functional

$$F - \lambda_1 G_1 - \cdots - \lambda_n G_n = \int_a^b [f(x, y, y') - \lambda_1 g_1(x, y, y') - \cdots - \lambda_n g_n(x, y, y')]\,dx.$$

Hint. Show by using problem 5 that at the required extremal point, the
equation $\delta F(y, h) = 0$ must hold for any displacement h that satisfies the
conditions $\delta G_1(y, h) = \cdots = \delta G_n(y, h) = 0$. Then use the result of problem 3,
Section 9, Chapter II.


2. Problems with Free End-points 


We now consider the case where the required curve y = y(x)
is subject to boundary conditions of another kind, and instead of
its end-points being fixed at points $(a, y_a)$, $(b, y_b)$, they can vary
along a given curve. Such problems are of frequent occurrence in
geometry and mechanics.

We first consider the case where the left end-point of the required
curve is fixed as before, but only the abscissa b of the right is
fixed.

As before the variation of the functional F(y) has the form

$$\delta F(y, h) = \int_a^b (f_y\,h + f_{y'}\,h')\,dx,$$

but the function h(x) is no longer obliged to vanish at x = b.
Integrating the second term by parts, we get

$$\delta F(y, h) = \int_a^b \left(f_y - \frac{d}{dx} f_{y'}\right) h\,dx + f_{y'}\,h\,\Big|_{x=b}.$$

At the required extremal point the variation $\delta F(y, h)$ must
vanish whatever the displacement function h(x). If we consider
first only those displacement functions for which h(b) = 0, we find
as before that the required curve satisfies Euler’s equation

$$f_y - \frac{d}{dx} f_{y'} = 0, \tag{1}$$

i.e. it is an extremal of the functional F. At the same time the
variation of the functional reduces at the extremal point to the
form

$$\delta F(y, h) = f_{y'}\big|_{x=b}\,h(b). \tag{2}$$

Since h(b) is arbitrary, the extremum condition $\delta F = 0$ reduces to
the equation

$$f_{y'}(b, y(b), y'(b)) = 0, \tag{3}$$

which must be satisfied by the required curve.

Example. A variant on the brachistochrone problem (p. 102)
consists in determining the curve y = y(x) along which a point
mass, starting from rest at the origin of coordinates, must slide in
order to reach the straight line x = b in the least possible time.
We recall that the functional F(y) has the form

$$F(y) = \int_0^b \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}}\,dx.$$

The extremals of F(y) that pass through the origin of coordinates
are the cycloids

$$x = C(\theta - \sin\theta), \qquad y = C(1 - \cos\theta). \tag{4}$$

It is easily seen that condition (3) reduces here to the form

$$y'(b) = 0.$$

We must therefore choose the cycloid (4) that is orthogonal to the
straight line x = b, i.e. the one on which y is a maximum at
x = b. The y-coordinate attains its greatest value when $\theta = \pi$,
and so we get for C

$$b = C\pi.$$

Thus the required curve is

$$x = \frac{b}{\pi}\,(\theta - \sin\theta), \qquad y = \frac{b}{\pi}\,(1 - \cos\theta).$$
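The choice θ = π can be cross-checked numerically. Along a cycloid of the family (4), dt = ds/v reduces to √(C/g) dθ, so the time needed to reach the parameter value θ is θ√(C/g); requiring x = b there fixes C = b/(θ − sin θ). The sketch below scans the end parameter over a grid (the grid bounds and resolution are arbitrary assumptions) and looks for the smallest value of gT²/b = θ²/(θ − sin θ).

```python
import math

def scaled_time_squared(theta):
    """g * T^2 / b for the cycloid (4) that meets the line x = b at parameter theta."""
    return theta * theta / (theta - math.sin(theta))

# Each end parameter theta in (0, 2*pi) fixes one cycloid through the origin.
grid = [0.5 + i * (5.5 / 100000) for i in range(100001)]
best_theta = min(grid, key=scaled_time_squared)
print(best_theta, math.pi)
```

The minimising parameter found on the grid agrees with π to within the grid spacing, in line with the condition y'(b) = 0.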


We turn now to the case in which the right end-point of the
required curve is restricted to lie on a given curve $y = \beta(x)$. In
this case the functional F(y) has the form

$$F(y) = \int_a^{\xi} f(x, y, y')\,dx,$$

where not only the function y = y(x) but also the right end-point $\xi$
of the interval of integration has to be determined. As the normed
linear space on which the functional F(y) is defined, we naturally
take the space $D_1(a, b)$ of all functions with continuous derivatives
on the closed interval [a, b], which comprises all possible positions
of the point $\xi$. The increment in F(y), when the functional argument
y(x) is replaced by y(x) + h(x), can be written in the form

$$\Delta F(y, h) = \int_a^{\xi} [f(x, y + h, y' + h') - f(x, y, y')]\,dx + \int_{\xi}^{\xi + \Delta\xi} f(x, y + h, y' + h')\,dx. \tag{5}$$

The principal linear part of the first term is evaluated just as in the
preceding case:

$$\delta_1 F(y, h) = \int_a^{\xi} \left(f_y - \frac{d}{dx} f_{y'}\right) h\,dx + f_{y'}\big|_{x=\xi}\,h(\xi). \tag{6}$$

The increment $\Delta\xi$ of the abscissa $\xi$, the quantity $h(\xi)$ and the
gradient of the curve $y = \beta(x)$, along which the right end-point of
the required curve is constrained to move, are connected by the
relation (cf. Fig. 7)

$$[\beta'(\xi) - y'(\xi)]\,\Delta\xi = h(\xi),$$

and hence $\Delta\xi$ can be expressed linearly in terms of $h(\xi)$:

$$\Delta\xi = \frac{h(\xi)}{\beta'(\xi) - y'(\xi)}.$$

We shall assume that the fixed curve is not an extremal, so that
$\beta'(\xi) \neq y'(\xi)$. We can now find the principal linear part of the
second term in equation (5):

$$\delta_2 F(y, h) = \Delta\xi\,f(\xi, y(\xi), y'(\xi)) = \frac{f(\xi, y, y')}{\beta'(\xi) - y'(\xi)}\,h(\xi). \tag{7}$$

Adding (6) and (7), we get

$$\delta F(y, h) = \int_a^{\xi} \left(f_y - \frac{d}{dx} f_{y'}\right) h\,dx + \left[f_{y'} + \frac{f}{\beta' - y'}\right]_{x=\xi} h(\xi).$$

At the required extremal point the variation $\delta F(y, h)$ must vanish
for any displacement h(x). If we consider first only those displacement
functions for which $h(\xi) = 0$, the second member vanishes as
before and the required curve must satisfy Euler’s equation

$$f_y - \frac{d}{dx} f_{y'} = 0.$$

As before, this curve is one of the extremals of the functional F.
For a general displacement function at the extremal point, we get
the equation

$$\left[f_{y'} + \frac{f}{\beta' - y'}\right]_{x=\xi} h(\xi) = 0,$$

which reduces, since $h(\xi)$ is arbitrary, to the condition

$$\left[f + (\beta' - y')\,f_{y'}\right]_{x=\xi} = 0. \tag{8}$$

This relation imposes an additional restriction on the elements
y(x), y'(x) of the required curve at a point of the curve $y = \beta(x)$,
and it can now be fully determined.

Example. What curve determines the minimum distance between
points A and B, where A = A(0, 0) is fixed, and B can move
along a prescribed curve $y = \beta(x)$?

Our functional here is

$$F(y) = \int_0^{\xi} \sqrt{1 + y'^2}\,dx.$$

The extremals of this functional are straight lines; the first stipulation
of the problem is met only by those lines that pass through
the origin,

$$y = kx.$$

Condition (8) acquires the form

$$\sqrt{1 + k^2} + (\beta' - k)\,\frac{k}{\sqrt{1 + k^2}} = 0$$

or, what is the same thing,

$$k\,\beta'(\xi) = -1.$$

This means that the required straight line y = kx must intersect
the curve $y = \beta(x)$ orthogonally.
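A small numerical experiment illustrates the orthogonality condition. The end-point curve y = 3 − x²/2, the search interval [0.5, 3] and the ternary-search helper below are all assumptions made for this sketch: the code locates the point of the curve nearest to the origin and then evaluates the product of the slope k of the joining line with the slope of the curve at that point.

```python
def beta(x):
    return 3.0 - 0.5 * x * x  # illustrative end-point curve (an assumption)

def beta_prime(x):
    return -x

def squared_distance(xi):
    """Squared distance from the origin to the point (xi, beta(xi))."""
    return xi * xi + beta(xi) ** 2

def argmin_unimodal(func, lo, hi, iterations=200):
    """Ternary search for the minimiser of a function unimodal on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

xi = argmin_unimodal(squared_distance, 0.5, 3.0)  # abscissa of the nearest point
k = beta(xi) / xi                                 # slope of the line y = k x
print(xi, k * beta_prime(xi))
```

For this curve the nearest point has abscissa 2, and the product of the two slopes comes out as −1 up to rounding, which is the orthogonality relation derived above.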

Condition (8) is sometimes referred to as the condition of
transversality; the required extremal must intersect the given curve
transversally.

Note. We have considered only those cases where the right 
end-point of the required curve moves along a prescribed curve. 
If the same is true of the left end-point, we can show by the same 
procedure as above that the required curve is an extremal of the 
functional F, and that the condition of transversality must be 
satisfied by both the left and right end-points of this extremal. 


Problems. 1. If $f(x, y, y') = A(x, y)\sqrt{1 + y'^2}$, the condition of transversality
reduces to the condition for orthogonality.


2. Find the variation of the functional

$$F(y) = \int_0^1 y^3\,y'^2\,dx$$

under the single condition y(0) = 1.

Answer.

$$\delta F(y, h) = \int_0^1 [3y^2 y'^2\,h + 2y^3 y'\,h']\,dx = \int_0^1 [3y^2 y'^2 - 2(y^3 y')']\,h\,dx + 2y^3(1)\,y'(1)\,h(1).$$

3. Find the variation of the functional

$$F(y) = \int_0^{x_0} [y^2 + y'^2]\,dx$$

under the conditions y(0) = 0, $y(x_0) = e^{2x_0}$.

Answer.

$$\delta F(y, h) = \int_0^{x_0} [2y - 2y'']\,h(x)\,dx + \left[2y'(x_0) + \frac{y^2(x_0) + y'^2(x_0)}{2e^{2x_0} - y'(x_0)}\right] h(x_0).$$

4. Find the variation of the functional


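$$F(y) = \int_{x_0}^{x_1} [y^2 + y'^2]\,dx$$

under the conditions $y(x_0) = \varphi(x_0)$, $y(x_1) = \psi(x_1)$.

Answer.

$$\delta F(y, h) = 2\int_{x_0}^{x_1} (y - y'')\,h\,dx - \left[2y'(x_0) + \frac{y^2(x_0) + y'^2(x_0)}{\varphi'(x_0) - y'(x_0)}\right] h(x_0) + \left[2y'(x_1) + \frac{y^2(x_1) + y'^2(x_1)}{\psi'(x_1) - y'(x_1)}\right] h(x_1).$$
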
5. Piecewise-smooth extremals. Suppose that on the class of all piecewise-smooth
curves y = y(x) with fixed values y(a), y(b), and an angular point (i.e. a
point of discontinuity of the derivative) at some $x = \xi$, an extremum of the
functional $F(y) = \int_a^b f(x, y, y')\,dx$ is realised at $y_0(x)$. Prove that $y_0(x)$ is
a solution of Euler’s equation for $x < \xi$ and $x > \xi$, and that the expressions
$f_{y'}(x, y_0(x), y_0'(x))$ and $f(x, y_0, y_0') - y_0'\,f_{y'}(x, y_0, y_0')$ are continuous at
$x = \xi$ (the Weierstrass-Erdmann principle).

Hint. If the angular point $(\xi, y(\xi))$ moves along a curve $\beta = \beta(\xi)$, the
variations of the components $\int_a^{\xi}$ and $\int_{\xi}^{b}$ must be compensatory. Use the
relation

$$\frac{h(\xi - 0)}{h(\xi + 0)} = \frac{\beta'(\xi) - y'(\xi - 0)}{\beta'(\xi) - y'(\xi + 0)},$$

which is evident from geometrical considerations.



6. Principle of reflection of extremals. With the hypotheses of the preceding
problem, find a necessary condition for the angular point of a piecewise-smooth
extremal to lie on a prescribed curve $y = \beta(x)$.

Answer. The continuity of the expression $f_{y'}(\beta' - y') + f$.

7. Principle of refraction of extremals. The curve $y = \beta(x)$ divides the
plane into two parts A, B, one containing the point (a, y(a)), the other the
point (b, y(b)). In the class of piecewise-smooth curves y = y(x) with a single
discontinuity of the derivative on the curve $y = \beta(x)$, find the one for which
the functional

$$\int_a^b f(x, y, y')\,dx, \qquad f(x, y, y') = \begin{cases} g(x, y, y') & \text{for } (x, y) \in A, \\ h(x, y, y') & \text{for } (x, y) \in B, \end{cases}$$

attains an extremum.

Answer. The required curve in each of the regions A, B is the solution
of the corresponding Euler equation. The condition

$$g_{y'}(\beta' - y'_A) + g = h_{y'}(\beta' - y'_B) + h$$

is satisfied on the boundary line.


5. FUNCTIONALS WITH SEVERAL UNKNOWN FUNCTIONS 


We consider a functional of the form

$$F(y) = \int_a^b f(x, y_1, \ldots, y_n, y_1', \ldots, y_n')\,dx \tag{1}$$

on the linear space $D_1^{(n)}(a, b)$ of vector functions $y = [y_1(x), \ldots, y_n(x)]$,
defined on the closed interval [a, b], and having continuous
first-order derivatives; the norm in this space is given by the
formula

$$\|y\| = \max_{a \le x \le b} \{|y_1(x)|, \ldots, |y_n(x)|, |y_1'(x)|, \ldots, |y_n'(x)|\}.$$

If the function f has derivatives up to the second order with respect
to all its arguments, then, as we saw in Section 2 (p. 86), the
functional (1) is differentiable on the space $D_1^{(n)}$, and its variation
has the form

$$\delta F(y, h) = \int_a^b \left[\frac{\partial f}{\partial y_1} h_1 + \frac{\partial f}{\partial y_1'} h_1' + \cdots + \frac{\partial f}{\partial y_n} h_n + \frac{\partial f}{\partial y_n'} h_n'\right] dx.$$

The displacement vector h here is the vector function $[h_1(x), \ldots, h_n(x)]$
in the same space $D_1^{(n)}$.

At the extremal point sought, the variation of the functional F
vanishes for all h. In particular, if we set all the components of
the displacement vector with the exception of one, $h_j(x)$, equal
to zero, we obtain the equation:

$$\int_a^b \left[\frac{\partial f}{\partial y_j} h_j + \frac{\partial f}{\partial y_j'} h_j'\right] dx = 0. \tag{2}$$

We shall solve the extremal problem on the linear manifold of
vector functions $y = [y_1(x), \ldots, y_n(x)]$ with prescribed boundary
values

$$y(a) = [y_1(a), \ldots, y_n(a)], \qquad y(b) = [y_1(b), \ldots, y_n(b)].$$

Then assuming that the required functions $y_1(x), \ldots, y_n(x)$ are
twice differentiable with respect to x, and applying the procedure
of Section 3, we derive from (2) the Euler equation

$$\frac{\partial f}{\partial y_j} - \frac{d}{dx}\,\frac{\partial f}{\partial y_j'} = 0. \tag{3}$$

The system of Euler equations (3) with j = 1, 2, ..., n is a system
of n second-order equations in n unknown functions. The general
solution of such a system contains 2n arbitrary constants $C_1, \ldots, C_{2n}$;
by choosing them appropriately, we can determine the solution
that satisfies the boundary conditions.

We can, however, dispense with the existence hypothesis for the
second derivative by adopting the same procedure as in Section 3
with the Du Bois-Reymond lemma. Only here, in place of the
restriction $\dfrac{\partial^2 f}{\partial y'^2} \neq 0$ we shall have $\det \left\| \dfrac{\partial^2 f}{\partial y_j'\,\partial y_k'} \right\| \neq 0$.

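Example. Equations of geodesics. Let us suppose that the square
of the differential of arc on an n-dimensional surface L is given by
the formula

$$ds^2 = \sum_{j,k=1}^{n} a_{jk}(u)\,du_j\,du_k,$$

so that the arc-length of a curve joining points A, B
is expressible in the form

$$S = \int_A^B \sqrt{\sum_{j,k} a_{jk}(u)\,du_j\,du_k}.$$

The coefficients $a_{jk}(u)$ are assumed differentiable with respect
to each of the arguments $u_1, \ldots, u_n$, and the quadratic form
$\sum a_{jk}(u)\,du_j\,du_k$ is positive definite.

We shall find the curves on which this functional has an extremal
value. Taking the $u_j$ to be expressible as functions of a parameter t,
we get the system of Euler equations

$$\frac{1}{2\sqrt{g}} \sum_{j,k} \frac{\partial a_{jk}}{\partial u_i}\,u_j'\,u_k' - \frac{d}{dt}\left[\frac{1}{\sqrt{g}} \sum_j a_{ij}\,u_j'\right] = 0 \qquad (i = 1, 2, \ldots, n), \tag{4}$$

where $g(u, u') = \sum_{j,k} a_{jk}\,u_j'\,u_k'$. This quantity becomes equal to 1 if t
denotes arc-length, as we now suppose. Then equation (4) becomes

$$\frac{1}{2} \sum_{j,k} \frac{\partial a_{jk}}{\partial u_i}\,u_j'\,u_k' - \frac{d}{dt} \sum_j a_{ij}\,u_j' = 0 \qquad (i = 1, 2, \ldots, n).$$

But on the other hand

$$\frac{d}{dt} \sum_j a_{ij}\,u_j' = \sum_{j,k} \frac{\partial a_{ij}}{\partial u_k}\,u_k'\,u_j' + \sum_j a_{ij}\,u_j''
= \frac{1}{2} \sum_{j,k} \left[\frac{\partial a_{ij}}{\partial u_k} + \frac{\partial a_{ik}}{\partial u_j}\right] u_j'\,u_k' + \sum_j a_{ij}\,u_j'',$$

and the equations can therefore be written in the form

$$\sum_j a_{ij}\,u_j'' = \frac{1}{2} \sum_{j,k} \left(\frac{\partial a_{jk}}{\partial u_i} - \frac{\partial a_{ij}}{\partial u_k} - \frac{\partial a_{ik}}{\partial u_j}\right) u_j'\,u_k'. \tag{5}$$

Since the form $g = \sum a_{jk}\,u_j'\,u_k'$ is non-degenerate, $\det \|a_{jk}\| \neq 0$,
and equation (5) can be solved for the $u_j''$. Denoting

$$\Gamma^{i}_{jk} = \frac{1}{2} \left[\frac{\partial a_{jk}}{\partial u_i} - \frac{\partial a_{ij}}{\partial u_k} - \frac{\partial a_{ik}}{\partial u_j}\right],$$

we get a system of equations of the form

$$u_m'' = \sum_{i,j,k} A_{im}\,\Gamma^{i}_{jk}\,u_j'\,u_k',$$

where the $A_{im}$ are the elements of the matrix inverse to $\|a_{im}\|$.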

Applying general theorems on the existence and uniqueness of 
the solution to a second order system, we conclude that through 
each point of the surface L (more precisely, through each non-singu- 
lar point, i.e. one at which the form g is non-degenerate) and in 
each direction there passes a unique geodesic. 

Equations of motion of a system of point-masses. Let a system
of n point-masses $m_1, \dots, m_n$ be given. We denote the coordinates
of the jth point by $x_j, y_j, z_j$. It is well known that the motion of
the system can be expressed by Newton's system of equations

$$m_j\ddot{x}_j = F_{jx},\quad m_j\ddot{y}_j = F_{jy},\quad m_j\ddot{z}_j = F_{jz} \qquad (j = 1, 2, \dots, n), \tag{6}$$

where dots indicate differentiation with respect to time, and
$F_{jx}, F_{jy}, F_{jz}$ are the components of the force $F_j$ acting on the jth
point. We assume that the forces $F_j$ possess a potential function
$U = U(x_1, y_1, z_1, \dots, x_n, y_n, z_n)$; this means that the following set
of equations holds:

$$F_{jx} = -\frac{\partial U}{\partial x_j},\quad F_{jy} = -\frac{\partial U}{\partial y_j},\quad F_{jz} = -\frac{\partial U}{\partial z_j} \qquad (j = 1, 2, \dots, n).$$


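The existence of a potential function U permits the evaluation
of the work done by the forces acting on the system in displacements
$dx_1, dy_1, \dots, dz_n$ as a "potential difference":

$$\sum_{j}\left(F_{jx}\,dx_j + F_{jy}\,dy_j + F_{jz}\,dz_j\right)
= -\sum_{j}\left[\frac{\partial U}{\partial x_j}\,dx_j + \frac{\partial U}{\partial y_j}\,dy_j + \frac{\partial U}{\partial z_j}\,dz_j\right] = -dU.$$

As we know, the function

$$T = \sum_{j}\frac{m_j}{2}\left(\dot{x}_j^2 + \dot{y}_j^2 + \dot{z}_j^2\right) = T(\dot{x}_1, \dot{y}_1, \dot{z}_1, \dots, \dot{x}_n, \dot{y}_n, \dot{z}_n)$$

is termed the kinetic energy of the system. We introduce two
important functionals,

$$J_1 = \int_a^b T(\dot{x}_1, \dots, \dot{z}_n)\,dt, \qquad J_2 = \int_a^b U(x_1, \dots, z_n)\,dt.$$

Taking the initial position of the system $x_1(a), \dots, z_n(a)$ and the
final position $x_1(b), \dots, z_n(b)$ to be fixed, we find the variations
of both functionals $J_1$, $J_2$. Denoting the respective components
of the displacement vector by $\delta x_1, \dots, \delta z_n$, we have

$$\delta J_1 = \int_a^b\left[\frac{\partial T}{\partial \dot{x}_1}\,\delta\dot{x}_1 + \dots + \frac{\partial T}{\partial \dot{z}_n}\,\delta\dot{z}_n\right]dt$$

or, integrating each term by parts, and remembering that

$$\delta x_1(a) = \delta x_1(b) = \dots = \delta z_n(a) = \delta z_n(b) = 0,$$

we get

$$\delta J_1 = -\int_a^b\left[\frac{d}{dt}\frac{\partial T}{\partial \dot{x}_1}\,\delta x_1 + \dots + \frac{d}{dt}\frac{\partial T}{\partial \dot{z}_n}\,\delta z_n\right]dt
= -\int_a^b\left[m_1\ddot{x}_1\,\delta x_1 + \dots + m_n\ddot{z}_n\,\delta z_n\right]dt.$$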

Further

$$\delta J_2 = \int_a^b\left[\frac{\partial U}{\partial x_1}\,\delta x_1 + \dots + \frac{\partial U}{\partial z_n}\,\delta z_n\right]dt
= -\int_a^b\left[F_{1x}\,\delta x_1 + \dots + F_{nz}\,\delta z_n\right]dt.$$

In virtue of Newton's equations (6), we have

$$\delta J_1 = \delta J_2,$$

and therefore

$$\delta(J_1 - J_2) = 0.$$

We see that at the functions $x_1(t), \dots, z_n(t)$, which describe the
actual position of the system during the interval $a \le t \le b$, the
functional

$$J_1 - J_2 = \int_a^b (T - U)\,dt$$

has a stationary value.

The fundamental problem in mechanics of a system of point 
masses thus turns out to be a problem in the variational calculus. 

This fact was first discovered by W. Hamilton (in 1835), and the 
result given therefore bears the title of Hamilton’s variational 
principle. 

The function $L = T - U = L(x_1, \dots, z_n, \dot{x}_1, \dots, \dot{z}_n)$ is termed
Lagrange's function for the system under consideration.

The motion of the system can often be expressed in terms of a
smaller number of variables than the 3n functions $x_1, \dots, z_n$,
according to the number of degrees of freedom (i.e. 3n less the
number of independent constraints). If r is the number of degrees
of freedom, the position of the system is determined by r parameters,
the "generalised coordinates" $q_1, q_2, \dots, q_r$. In particular, all
the rectangular coordinates of points of the system can be expressed
in terms of the parameters $q_1, q_2, \dots, q_r$:

$$x_j = x_j(q_1, \dots, q_r),\quad y_j = y_j(q_1, \dots, q_r),\quad z_j = z_j(q_1, \dots, q_r) \qquad (j = 1, 2, \dots, n).$$

It follows that

$$\dot{x}_j = \sum_{k=1}^{r}\frac{\partial x_j}{\partial q_k}\,\dot{q}_k,\quad
\dot{y}_j = \sum_{k=1}^{r}\frac{\partial y_j}{\partial q_k}\,\dot{q}_k,\quad
\dot{z}_j = \sum_{k=1}^{r}\frac{\partial z_j}{\partial q_k}\,\dot{q}_k,$$

and hence the kinetic energy

$$T = \sum_{j}\frac{m_j}{2}\left(\dot{x}_j^2 + \dot{y}_j^2 + \dot{z}_j^2\right)$$

is some quadratic form in the "generalised velocities" $\dot{q}_j$:

$$T = \sum_{j,k} a_{jk}\,\dot{q}_j\,\dot{q}_k,$$

the coefficients of which are functions of the generalised coordinates.
Similarly the potential function $U(x_1, \dots, z_n)$ is a function
of the generalised coordinates

$$U = U(q_1, \dots, q_r).$$

Lagrange's function $L = T - U$ now emerges as a function of
$q_1, \dots, q_r, \dot{q}_1, \dots, \dot{q}_r$.

The conditions for stationary values of the functional

$$\int_a^b L\,dt$$

can, as always, be expressed as Euler equations, and now assume
the form

$$\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = 0 \qquad (j = 1, 2, \dots, r); \tag{7}$$

this is the so-called Lagrange system of equations of the second
order.

Since the kinetic energy of a system vanishes in a position of 
equilibrium, the conditions for equilibrium can easily be obtained 
from the equations of motion. Now the potential function is inde- 
pendent of the generalised velocities, and so we get as the condi- 
tions of equilibrium 

$$\frac{\partial U}{\partial q_j} = 0 \qquad (j = 1, 2, \dots, r),$$

i.e. an equilibrium position corresponds to a stationary value of
the potential energy. 


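The equations (7) admit a first integral, termed the energy integral.
To obtain it, we multiply each equation by $dq_j = \dot{q}_j\,dt$, and
add the resulting equations, giving

$$\sum_{j}\frac{\partial T}{\partial q_j}\,dq_j - \sum_{j}\frac{\partial U}{\partial q_j}\,dq_j - \sum_{j}\dot{q}_j\,d\!\left[\frac{\partial T}{\partial \dot{q}_j}\right] = 0.$$

Since

$$\sum_{j}\dot{q}_j\,d\!\left[\frac{\partial T}{\partial \dot{q}_j}\right] = d\!\left[\sum_{j}\dot{q}_j\frac{\partial T}{\partial \dot{q}_j}\right] - \sum_{j}\frac{\partial T}{\partial \dot{q}_j}\,d\dot{q}_j,$$

and in virtue of the homogeneity of the form T in the $\dot{q}_j$, which
gives

$$\sum_{j}\dot{q}_j\frac{\partial T}{\partial \dot{q}_j} = 2T,$$

we get

$$\sum_{j}\frac{\partial T}{\partial q_j}\,dq_j - \sum_{j}\frac{\partial U}{\partial q_j}\,dq_j - 2\,dT + \sum_{j}\frac{\partial T}{\partial \dot{q}_j}\,d\dot{q}_j
= dT - dU - 2\,dT = -(dU + dT) = 0,$$

and consequently

$$U + T = \text{const.}$$

Thus the total energy of the system (the sum of the potential and
kinetic energies) remains invariant during the motion.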
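
The energy integral can be observed numerically. The sketch below is only an illustration under our own assumptions: a pendulum with one generalised coordinate q, unit mass, length and gravity, so that $T = \dot{q}^2/2$, $U = 1 - \cos q$ and the Lagrange equation is $\ddot{q} = -\sin q$. The integrator (classical fourth-order Runge-Kutta) and all function names are hypothetical choices, not part of the text.

```python
import math


def pendulum_rhs(q, v):
    """Lagrange equation for L = v**2/2 - (1 - cos q), i.e. q'' = -sin q."""
    return v, -math.sin(q)


def rk4_step(q, v, dt):
    """One classical Runge-Kutta step for the first-order system (q, v)."""
    k1q, k1v = pendulum_rhs(q, v)
    k2q, k2v = pendulum_rhs(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
    k3q, k3v = pendulum_rhs(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
    k4q, k4v = pendulum_rhs(q + dt * k3q, v + dt * k3v)
    q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q, v


def total_energy(q, v):
    """T + U for the pendulum."""
    return 0.5 * v * v + (1.0 - math.cos(q))


def energy_drift(q0=1.0, v0=0.0, dt=1e-3, steps=10_000):
    """Largest deviation of T + U from its initial value along the motion."""
    q, v = q0, v0
    e0 = total_energy(q, v)
    worst = 0.0
    for _ in range(steps):
        q, v = rk4_step(q, v, dt)
        worst = max(worst, abs(total_energy(q, v) - e0))
    return worst
```

Calling `energy_drift()` returns a number at the level of the integrator's truncation error, which is the numerical counterpart of U + T = const.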

This last fact permits an easy proof of the following theorem on 
the stable equilibrium of a system: 

THEOREM (Liouville). If the potential function has a strict minimum
at the point $q^0 = (q_1^0, \dots, q_r^0)$ then for any (sufficiently small)
$\varepsilon > 0$, there will exist $\delta > 0$ such that, if a kinetic energy of magnitude
less than $\delta$ is imparted to the system when situated at rest at $q^0$,
then during the whole subsequent motion of the system the point
$q = (q_1, \dots, q_r)$ will remain within the neighbourhood $|q - q^0| < \varepsilon$.

Proof. Since by hypothesis the function $U(q_1, \dots, q_r)$ has a strict
minimum at the point $q^0$, there will exist a sphere $|q - q^0| \le \varepsilon$
at the boundary of which the inequality

$$U(q_1, \dots, q_r) > U(q_1^0, \dots, q_r^0) + \delta$$

is everywhere satisfied, $\delta$ being a fixed positive number. If a kinetic
energy $T < \delta$ is imparted to the system when situated at rest at
the point $q^0$, then in the subsequent motion the total energy
T + U of the system will remain constant and will not exceed
$U(q_1^0, \dots, q_r^0) + \delta$. But since for $|q - q^0| = \varepsilon$, $U(q_1, \dots, q_r)$ already
exceeds the given magnitude, the total energy at such a point could not fail to
exceed $U(q_1^0, \dots, q_r^0) + \delta$, whatever the value of T ($\ge 0$), which is
impossible. Hence the point $(q_1, \dots, q_r)$ must lie within the prescribed
neighbourhood.
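
The argument of the proof can be tried out on a one-dimensional example. The following sketch assumes, purely for illustration, the potential $U(q) = 1 - \cos q$ (strict minimum at q = 0) with unit mass, and takes $\delta = U(\varepsilon) - U(0)$; the velocity Verlet integrator and the function names are our own hypothetical choices.

```python
import math


def potential(q):
    """Model potential with a strict minimum at q = 0."""
    return 1.0 - math.cos(q)


def max_excursion(eps, fraction=0.5, dt=1e-3, steps=50_000):
    """Start at rest position q = 0 with kinetic energy fraction*delta,
    where delta = U(eps) - U(0), and return the largest |q| reached."""
    delta = potential(eps) - potential(0.0)
    kinetic = fraction * delta
    q, v = 0.0, math.sqrt(2.0 * kinetic)   # T = v**2 / 2 for unit mass
    worst = 0.0
    for _ in range(steps):
        # velocity Verlet step for q'' = -dU/dq = -sin q
        v_half = v - 0.5 * dt * math.sin(q)
        q = q + dt * v_half
        v = v_half - 0.5 * dt * math.sin(q)
        worst = max(worst, abs(q))
    return worst
```

For any `fraction` below 1 the returned excursion stays below `eps`, in agreement with the theorem: the system never has enough energy to reach the boundary of the neighbourhood.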

A more searching analysis, which we shall not give here†, will
show that in a sufficiently small neighbourhood $|q - q^0| \le \varepsilon$ it is
possible to go over (by means of a linear transformation) from the
coordinates $(q_1, \dots, q_r)$ to new coordinates $(\tau_1, \dots, \tau_r)$ so that, in
the new coordinates, the equations of motion (with an accuracy
up to infinitesimals of a higher order) have the form

$$\tau_k = \xi_k\cos(\omega_k t + \alpha_k),$$

where $\xi_k, \omega_k, \alpha_k$ are fixed numbers (k = 1, 2, ..., r).

† Cf. for example, G. Ye. Shilov, Introduction to the Theory of Linear
Spaces, Prentice-Hall, Section 76, p. 218.

Note 1. It is of interest to observe that the problem of the motion
of a mechanical system with n degrees of freedom can be treated
in terms of the motion of a point along a geodesic on an n-dimensional
surface E = const., taken with a specially chosen metric.

For the functional corresponding to the kinetic energy can be
expressed in the form

$$\int T\,dt = \int\sqrt{E - U}\,\sqrt{T}\,dt = \int\sqrt{(E - U)\sum a_{jk}\,dq_j\,dq_k},$$

and under the condition E = const. the extrema of this functional
and Hamilton's functional are attained on one and the same curve.

Note 2. Hamilton's principle possesses a characteristic feature
in that its formulation omits any assumption of finiteness in respect
of the number of degrees of freedom. It can therefore be applied
even to mechanical systems with infinitely many degrees of freedom,
and in particular, to problems with a continuous mass distribution,
provided the potential and kinetic energies can be calculated
for these systems. (The applicability of Hamilton's principle
can also be deduced in this case from the heuristic consideration
that a continuous medium can be regarded as a system composed
of a very large but finite number of separate particles.) We
shall consider such problems in Section 8. Here we consider only
one equilibrium problem, the form of equilibrium of a flexible
inextensible string of prescribed length L suspended by its ends.
An element of string has mass $\mu(x)\,ds$, where ds is the element of
arc-length and $\mu(x)$ is the density. The gravitational force $\mu(x)\,ds \cdot g$
acting on this element has a potential function $\mu y g\,ds$. The total
potential energy of the string is given by the integral

$$U = \int \mu(x)\,y(x)\,g\,ds = g\int_a^b \mu\,y\,\sqrt{1 + y'^2}\,dx.$$

The equilibrium condition is the condition for U to have a minimum.
We also know the length of the string

$$\int_a^b \sqrt{1 + y'^2}\,dx = L.$$

We have arrived at a problem on conditional extrema. In the case
of a uniform string ($\mu(x)$ = const.) its solution (cf. Section 5) is
the arc of a catenary. And so a uniform flexible inextensible string
hangs in equilibrium along a catenary (whence the name).


6. FUNCTIONALS WITH SEVERAL INDEPENDENT VARIABLES 


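1. We consider functionals of the form

$$F(u) = \iint_G f\left(x, y, u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right)dx\,dy \tag{1}$$

on the space $D_1(G)$ of functions u(x, y), defined on a plane (bounded)
region G, continuous, and having continuous first order derivatives
with respect to each of the arguments x, y. The norm in this space
will be given by

$$\|u\| = \max_{G}\left\{|u(x, y)|,\ \left|\frac{\partial u(x, y)}{\partial x}\right|,\ \left|\frac{\partial u(x, y)}{\partial y}\right|\right\}.$$

If the function f(x, y, u, v, w) has continuous derivatives up to the
second order with respect to u, v, w, then as we saw in Section 2
(p. 84) the functional (1) is differentiable on the space $D_1(G)$ and
its variation has the form

$$\delta F(u, h) = \iint_G\left[\frac{\partial f}{\partial u}\,h + \frac{\partial f}{\partial u_x}\,h_x + \frac{\partial f}{\partial u_y}\,h_y\right]dx\,dy.$$

Here h = h(x, y) is the displacement of the function u(x, y). At
an extremal point the variation of the functional F(u) vanishes
for any displacement h(x, y):

$$\iint_G\left[\frac{\partial f}{\partial u}\,h + \frac{\partial f}{\partial u_x}\,h_x + \frac{\partial f}{\partial u_y}\,h_y\right]dx\,dy = 0.$$

We transform this equation by integration by parts, assuming
that the values of the function u(x, y) on the boundary Γ of the
region G have been fixed, and that the function h(x, y) consequently
vanishes on Γ. For the term $(\partial f/\partial u_x)\,h_x$, say, we have, along a
horizontal chord of G with end-points $A_y$, $B_y$ on Γ,

$$\int_{A_y}^{B_y}\frac{\partial f}{\partial u_x}\,h_x\,dx = \left.\frac{\partial f}{\partial u_x}\,h\right|_{A_y}^{B_y} - \int_{A_y}^{B_y} h\,\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right)dx
= -\int_{A_y}^{B_y} h\,\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right)dx,$$

and hence

$$\iint_G\frac{\partial f}{\partial u_x}\,h_x\,dx\,dy = -\iint_G\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right)h\,dx\,dy.$$

The next term is transformed similarly and we obtain the equation

$$\delta F(u, h) = \iint_G\left[\frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial u_y}\right)\right]h(x, y)\,dx\,dy = 0.$$

Since the function h(x, y) is arbitrary we have the result

$$\frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial u_x}\right) - \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial u_y}\right) = 0, \tag{2}$$

which is known as the Euler-Ostrogradsky equation. It is an
equation in partial derivatives of the second order; the unknown
function u(x, y) must be determined as the solution of this equation
which satisfies the prescribed boundary conditions (the function
u(x, y) is known on Γ). The problem of determining u(x, y) from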
equation (2) subject to the given boundary conditions is known 
as Dirichlet’s problem. Just as with the corresponding problems in 
a single variable, Dirichlet’s problem may or may not have a solu- 
tion; for many important equations of the form (2), the existence 
and uniqueness of a solution are proved in the theory of partial 
differential equations. 

The Euler—Ostrogradsky equation can be formulated quite anal- 
ogously in the case of three or more independent variables. 


2. Example 1. For the functional

$$F(u) = \iint_G\left(u_x^2 + u_y^2\right)dx\,dy$$

the Euler-Ostrogradsky equation has the form

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0;$$

its solutions are said to be harmonic functions. It is proved in the
theory of partial differential equations that in this instance a solution
of Dirichlet's problem exists (uniquely) for any region G with
a piecewise-smooth boundary Γ and any continuous boundary values
prescribed for u(x, y) on Γ. Cf. also Chapter 5,
Section 8.
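
A discrete version of this Dirichlet problem is easy to experiment with. The sketch below is an illustration under our own assumptions: the region is the unit square, the Laplace equation is replaced by the usual five-point difference scheme, and the resulting linear system is solved by Gauss-Seidel sweeps. The boundary values are taken from the harmonic function $x^2 - y^2$, for which the five-point scheme happens to be exact, so the computed grid values should reproduce it. All function names are hypothetical.

```python
def solve_dirichlet(boundary, n=11, sweeps=500):
    """Gauss-Seidel iteration for the five-point discrete Laplace equation
    on an n-by-n grid over the unit square, with prescribed boundary values."""
    h = 1.0 / (n - 1)
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                u[i][j] = boundary(i * h, j * h)
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # each interior value becomes the mean of its four neighbours
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1])
    return u, h


def harmonic(x, y):
    """A harmonic polynomial used both as boundary data and as reference."""
    return x * x - y * y


def max_error(n=11, sweeps=500):
    """Largest difference between the grid solution and x**2 - y**2."""
    u, h = solve_dirichlet(harmonic, n, sweeps)
    return max(abs(u[i][j] - harmonic(i * h, j * h))
               for i in range(n) for j in range(n))
```

Replacing a value by the mean of its neighbours minimises the discrete analogue of the functional F(u) with respect to that single value, which is one way of seeing why the iteration approaches the harmonic solution.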


Problem. Find an extremum of the functional

$$F(u) = \int_0^1\!\!\int_0^a e^{u_y}\sin u_x\,dx\,dy$$

subject to the conditions u(x, 0) = 0, u(x, 1) = 1.
Answer. u(x, y) = y.

Note. This problem has a unique solution although the boundary
conditions are not prescribed on the whole boundary.

The function u(x, y) can also be made subject to other boundary
conditions (in addition to assuming fixed values). In these cases
the required solution will again be a solution of Euler's equation
(an extremal) satisfying prescribed boundary conditions, but if it
is not yet uniquely determined by these conditions, it will be subject
to additional conditions on the boundary, obtained from the
requirement that the variation of the functional vanish (as in the
free end point problem for a simple functional).

3. Example 2. We introduce the equation for small vibrations
of a string. A string, placed in a position of equilibrium between
points 0 and l on the x-axis, executes small vibrations about this
position. It is assumed that each point moves only in a direction
perpendicular to the x-axis. We denote by u(x, t) the configuration
of the string at an instant t; for definiteness, we suppose that the
end-points remain fixed, so that u(0, t) = u(l, t) = 0. The kinetic
energy of the string, as the sum of the kinetic energies of the
particles that constitute it, can be expressed by the integral

$$T = \frac{1}{2}\int_0^l \mu\,u_t^2\,dx,$$

where $\mu\,dx$ is the mass of an element of string corresponding to
an interval dx on the x-axis. The quantity $\mu = \mu(x)$ is the density
of the string at the point x. A most important property of the
string is its potential energy; strictly speaking, the expression
for the potential energy actually defines the string from the
mechanical point of view. A string is a one-dimensional mechanical
system, the potential energy of each part of which is proportional to
its extension relative to the position of equilibrium. Thus we have

$$dU = p(x)\left[\sqrt{1 + u_x^2}\,dx - dx\right];$$

the coefficient p = p(x) figuring in this equation is called the modulus
of elasticity of the string (Young's modulus). Assuming $u_x$
to be so small that its fourth power can be neglected, we get

$$U = \frac{1}{2}\int_0^l p\,u_x^2\,dx.$$

Lagrange's function L = T - U has the form

$$L = \frac{1}{2}\int_0^l\left(\mu\,u_t^2 - p\,u_x^2\right)dx.$$

Hamilton's functional $\int_a^b L\,dt$ will now be the double integral

$$\frac{1}{2}\int_0^l\!\!\int_a^b\left[\mu\,u_t^2 - p\,u_x^2\right]dt\,dx.$$

And we can now write the Euler-Ostrogradsky equation as

$$-\frac{\partial}{\partial t}(\mu\,u_t) + \frac{\partial}{\partial x}(p\,u_x) = 0. \tag{3}$$


If $\mu$ and p are constants (i.e. the string is uniform in respect of
density and elasticity), we obtain the equation

$$\mu\,u_{tt} - p\,u_{xx} = 0, \tag{4}$$

in which we are interested.

The boundary conditions arise naturally from the physical circumstances
and here we can take them as follows: the values of
the function u(x, 0) are prescribed for t = 0 (i.e. the initial configuration
is known), and similarly the values of the function
$u_t(x, 0)$ are given (the initial velocity of each point is known).
We shall show that they determine just one solution of equation (3).
If there were two solutions $u_1(x, t)$, $u_2(x, t)$ for the equation of the
string, taking the identical values $u_1(x, 0) = u_2(x, 0)$,
$u_{1t}(x, 0) = u_{2t}(x, 0)$, then their difference $u(x, t) = u_1(x, t) - u_2(x, t)$
would also be a solution satisfying the null conditions u(x, 0) = 0,
$u_t(x, 0) = 0$. We must show that $u(x, t) \equiv 0$. To do this, we
make use of the following consideration. The total energy of the
string

$$E = T + U = \frac{1}{2}\int_0^l\left[\mu\,u_t^2 + p\,u_x^2\right]dx$$

must remain constant throughout, just as in the case of a finite
system of point masses (we give a strict proof of this below). But at
the initial instant, by hypothesis, $u_t(x, 0) = 0$, u(x, 0) = 0, so that
$u_x(x, 0) = 0$, and consequently at t = 0, we also have E = 0;
but then E = 0 at any instant, and so $u_t = u_x = 0$. It follows
that u(x, t) is a constant, and since u(x, 0) = 0, u(x, t) = 0 for
any t.

It remains to verify the law of conservation of energy for a string.
The proof proceeds analogously to the case of a finite system of
point masses with sums replaced by integrals with respect to x.
We multiply the string equation (3) by $u_t$ and integrate with
respect to x:

$$\int_0^l u_t\left[\frac{\partial}{\partial x}(p\,u_x) - \frac{\partial}{\partial t}(\mu\,u_t)\right]dx = 0.$$

We integrate the first term by parts:

$$\int_0^l u_t\,\frac{\partial}{\partial x}(p\,u_x)\,dx = \Bigl.u_t\,p\,u_x\Bigr|_0^l - \int_0^l u_{tx}\,p\,u_x\,dx.$$

The integrated term vanishes, since $u_t(0, t)$ and $u_t(l, t)$
vanish together with u(0, t) and u(l, t). Thus we have

$$\int_0^l\left[\frac{\partial}{\partial t}(\mu\,u_t)\,u_t + p\,u_x\,u_{xt}\right]dx
= \frac{1}{2}\int_0^l\left[\frac{\partial}{\partial t}\left(\mu\,u_t^2\right) + \frac{\partial}{\partial t}\left(p\,u_x^2\right)\right]dx = 0,$$

and so

$$\frac{d}{dt}\,\frac{1}{2}\int_0^l\left[\mu\,u_t^2 + p\,u_x^2\right]dx = 0,$$

from which it follows that

$$E = \text{const.}$$

as required.

We show now how to construct the actual solution, at least for
sufficiently smooth initial functions $\varphi(x) = u(x, 0)$, $\psi(x) = u_t(x, 0)$.
For simplicity we suppose that p = 1, $\mu = 1$, $l = \pi$ (the general
case is considered in Chapter 5, Section 5). It is easily verified that
for any integer n the functions sin nx cos nt, sin nx sin nt satisfy
the equation

$$u_{tt} - u_{xx} = 0 \tag{5}$$

and the boundary conditions u(0, t) = u(l, t) = 0. We form the
series

$$u(x, t) = \sum_{n=1}^{\infty}\sin nx\,(a_n\cos nt + b_n\sin nt), \tag{6}$$

the coefficients of which are determined from the conditions

$$u(x, 0) = \sum_{n=1}^{\infty} a_n\sin nx = \varphi(x),$$

$$u_t(x, 0) = \sum_{n=1}^{\infty} n\,b_n\sin nx = \psi(x).$$

If the functions $\varphi(x)$, $\psi(x)$ are sufficiently smooth†, the coefficients
$a_n$, $b_n$ will tend to zero sufficiently rapidly to ensure the absolute
convergence of the series (6) together with its first and second
derivatives with respect to x and t. We can then evaluate $u_{tt}$
and $u_{xx}$ by summing the corresponding derived series; since
equation (5) is satisfied for each term that occurs, it will also be
satisfied for the sums. A more detailed account of this question
and an analysis of cases in which p and $\mu$ are not constant would
take us beyond the limits of our course.†

† More precisely, if the odd periodic continuations of $\varphi(x)$, $\psi(x)$ over the
whole axis with period $2\pi$ are sufficiently smooth.

Problem. Find the law of vibration of a string fixed at its end-points
x = 0, $x = \pi$ if $p = \mu = 1$ and u(x, 0) = 0, $u_t(x, 0) = \sin 2x\cos x$.

Answer. u(x, t) = 1/2 sin x sin t + 1/6 sin 3x sin 3t.
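
The answer to this problem can be checked by computing the sine coefficients of the initial velocity numerically. In the sketch below the midpoint quadrature rule and the function names are our own assumptions; the relation used is the one above, namely that $n\,b_n$ is the n-th sine coefficient of $\psi(x) = \sin 2x\cos x$ on $[0, \pi]$.

```python
import math


def sine_coefficient(func, n, samples=2000):
    """(2/pi) * integral over [0, pi] of func(x) sin(n x) dx, midpoint rule."""
    h = math.pi / samples
    total = sum(func((k + 0.5) * h) * math.sin(n * (k + 0.5) * h)
                for k in range(samples))
    return 2.0 / math.pi * total * h


def psi(x):
    """Initial velocity of the string in the problem."""
    return math.sin(2 * x) * math.cos(x)


def velocity_coefficients(n_max=5):
    """b_1, ..., b_n_max from n * b_n = n-th sine coefficient of psi."""
    return [sine_coefficient(psi, n) / n for n in range(1, n_max + 1)]
```

Since $\sin 2x\cos x = \tfrac12(\sin x + \sin 3x)$, only $b_1 = 1/2$ and $b_3 = 1/6$ are different from zero, and all $a_n$ vanish because u(x, 0) = 0.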

4. We consider a further case, that of forced motion, in which a
string is acted on by an external force. Let a force $f(x, t)\,\Delta x$ act
on an element $\Delta x$ of the string. This force possesses a potential
function (determined by the work done in the displacement from 0 to u),

$$U_f = -\int_0^l f(x, t)\,u(x, t)\,dx,$$

and the total potential energy can therefore be expressed by the
formula

$$U = \int_0^l\left[\frac{p}{2}\,u_x^2 - f\,u\right]dx.$$

Euler's equation will now assume the form

$$\mu\,u_{tt} - p\,u_{xx} = f(x, t).$$

If the external force is actually independent of time, so that
f(x, t) = f(x), we can derive the form of the equilibrium position
from this equation. In equilibrium $u_{tt} = 0$, and the configuration
of the string u = u(x) therefore satisfies the equation

$$p\,u_{xx} = -f(x).$$

We determine the solution of this equation that interests us by
means of the boundary conditions at the points x = 0, x = l.
Thus a uniform string ($\mu$, p = const.) hangs under gravity
($f(x) = -\mu g$, the displacement u being reckoned upwards) in the form
of a parabola satisfying the equation

$$p\,u_{xx} = \mu g.$$


† Cf. for example, I. G. Petrovsky, Lectures on Partial Differential Equations,
Interscience, 1955; R. Courant and D. Hilbert, Methods of Mathematical
Physics, vol. 1, Chapter V, Interscience, N.Y. (1953).


5. Example 3. The equation of small vibrations of a membrane.
A membrane is the "two dimensional analogue" of a string: in
other words, it is a mechanical system in the form of a surface,
the potential energy of each part of which is proportional to its
increase in area relative to the equilibrium position. Thus if a
function u(x, y, t), defined on a region G in the (x, y)-plane for
$t \ge 0$, describes the configuration of a membrane at an instant t,
then the expression for its potential energy assumes the form

$$U = \iint_G p\left(\sqrt{1 + u_x^2 + u_y^2} - 1\right)dx\,dy \approx \frac{1}{2}\iint_G p\left(u_x^2 + u_y^2\right)dx\,dy.$$

The kinetic energy has the form

$$T = \frac{1}{2}\iint_G \mu(x, y)\,u_t^2\,dx\,dy.$$

Lagrange's function becomes

$$L = T - U = \frac{1}{2}\iint_G\left[\mu\,u_t^2 - p\left(u_x^2 + u_y^2\right)\right]dx\,dy.$$

Hence the Euler-Ostrogradsky equation assumes the form

$$\frac{\partial}{\partial x}(p\,u_x) + \frac{\partial}{\partial y}(p\,u_y) - \frac{\partial}{\partial t}(\mu\,u_t) = 0.$$

For $\mu$, p constant, an equation of the form

$$u_{tt} = c^2\left(u_{xx} + u_{yy}\right), \qquad c^2 = p/\mu,$$

is obtained. As in the case of a string, we can prescribe u(x, y, 0)
and $u_t(x, y, 0)$ as initial conditions. The subsequent theory follows
a course basically parallel to that of the string; we again recommend
to the reader the course in partial differential equations
indicated above.


7. FUNCTIONALS WITH HIGHER DERIVATIVES


1. A functional of the form

$$F(y) = \int_a^b f\left(x, y, y', \dots, y^{(m)}\right)dx \tag{1}$$

is defined on the space $D_m(a, b)$ of functions y = y(x) with m
continuous derivatives on the closed interval [a, b]. We recall that
the norm in $D_m(a, b)$ is given by

$$\|y\| = \max_{a \le x \le b}\left\{|y(x)|,\ |y'(x)|,\ \dots,\ |y^{(m)}(x)|\right\}.$$

If the function $f(x, y_0, y_1, \dots, y_m)$ has derivatives up to the second
order with respect to the arguments $y_0, \dots, y_m$, continuous for all
$y_0, \dots, y_m$, then, as was shown in Section 2 (p. 84), the functional
(1) is differentiable on $D_m(a, b)$ and its variation has the form

$$\delta F(y, h) = \int_a^b\left[\frac{\partial f}{\partial y}\,h + \frac{\partial f}{\partial y'}\,h' + \dots + \frac{\partial f}{\partial y^{(m)}}\,h^{(m)}\right]dx. \tag{2}$$

The displacement vector h = h(x) is a function in the same space
$D_m(a, b)$. We shall solve the extremal problem for the functional
F(y) on the manifold of functions $y(x) \in D_m(a, b)$ with prescribed
values

$$\begin{aligned}
y(a) &= a_0, & y'(a) &= a_1, & \dots, && y^{(m-1)}(a) &= a_{m-1};\\
y(b) &= b_0, & y'(b) &= b_1, & \dots, && y^{(m-1)}(b) &= b_{m-1}.
\end{aligned} \tag{3}$$

The function h(x) then satisfies the conditions

$$\begin{aligned}
h(a) &= h'(a) = \dots = h^{(m-1)}(a) = 0;\\
h(b) &= h'(b) = \dots = h^{(m-1)}(b) = 0.
\end{aligned} \tag{4}$$

Let us suppose that the required function y = y(x) has continuous
derivatives up to order 2m. Then all the functions $\partial f/\partial y^{(k)}$ that
occur in (2) will, as functions of x, have continuous derivatives up
to order m. We integrate each of the terms by parts (commencing
with the second) as many times as are required to remove all derivatives
of the function h(x). In virtue of (4), all the integrated
terms vanish, and we have

$$\delta F(y, h) = \int_a^b\left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} + \frac{d^2}{dx^2}\frac{\partial f}{\partial y''} - \dots + (-1)^m\frac{d^m}{dx^m}\frac{\partial f}{\partial y^{(m)}}\right]h(x)\,dx.$$

Since h(x) is arbitrary, the required function y must satisfy the
equation

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} + \frac{d^2}{dx^2}\frac{\partial f}{\partial y''} - \dots + (-1)^m\frac{d^m}{dx^m}\frac{\partial f}{\partial y^{(m)}} = 0.$$




This is an ordinary differential equation of order 2m, and it is 
also known as Euler’s equation. The general solution contains 2m 
arbitrary constants, which can be used in conjunction with the 
conditions (3) to determine the solution that interests us. 


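Problem. Find the extremals of the functional

$$F(y) = \int_a^b (y'')^n\,dx.$$

Answer. $y = \dfrac{(n-1)^2}{\alpha^2\,n(2n-1)}\,(\alpha x + \beta)^{\frac{2n-1}{n-1}} + \gamma x + \delta$. For n = 1/2 we have
$y = \gamma x + \delta - \dfrac{1}{\alpha^2}\log(\alpha x + \beta)$; for n = 1 the value of the functional is
independent of y.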


2. Problems sometimes occur in which not all the boundary 
conditions (3) are given, but a smaller number only, so that the 
general solution of Euler’s equation retains some free constants 
though it satisfies the boundary conditions. Such problems are 
similar to the free end-point problems for simple functionals 
(Section 4). For their solution we have to transform the variation (2) 
utilising the existing boundary conditions, and then obtain sup- 
plementary conditions on the boundary by equating it to zero. 

Example. Find the curve y = y(x) that determines an extremum
of the functional

$$F(y) = \frac{1}{2}\int_a^b (y'')^2\,dx$$

subject to the conditions y(a) = y(b) = 0.
Solution. The variation of the functional F(y) has the form

$$\delta F(y, h) = \int_a^b y''\,h''\,dx.$$

After one integration by parts the boundary terms remain, since
the function h'(x) need not vanish on the boundary, but a second
integration produces no new boundary terms (h(a) = h(b) = 0).
As a result we get

$$\delta F(y, h) = \Bigl.y''(x)\,h'(x)\Bigr|_a^b - \int_a^b y'''(x)\,h'(x)\,dx
= \left[y''(b)\,h'(b) - y''(a)\,h'(a)\right] + \int_a^b y^{IV}(x)\,h(x)\,dx.$$

The given expression must vanish for the extremal function y(x)
for any function $h(x) \in D_2(a, b)$ with h(a) = h(b) = 0. If in addition
h'(a) = h'(b) = 0, we get

$$\int_a^b y^{IV}(x)\,h(x)\,dx = 0,$$

so that $y^{IV}(x) = 0$, and y(x) is therefore a cubic parabola:

$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3.$$

Thus the integral term in the expression for the variation vanishes,
and we obtain for the general case

$$\delta F(y, h) = y''(b)\,h'(b) - y''(a)\,h'(a) = 0.$$

Since h'(a) and h'(b) are independent, we must have y''(a) = y''(b)
= 0. These two conditions together with the conditions y(a) =
y(b) = 0 determine the required solution uniquely.


Problem. Find the curve y = y(x) that determines an extremum of the
functional

$$F(y) = \frac{1}{2}\int_0^1 (y'')^2\,dx$$

subject to the conditions y(0) = y'(0) = 0, y'(1) = 1.
Answer. $y = \tfrac{1}{2}x^2$.
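
The answer can be tested numerically by comparing the value of the functional on $y = x^2/2$ with its values on nearby admissible curves. The sketch below is only an illustration: the midpoint quadrature and the two displacement functions $h_1 = x^2(x-1)^2$ and $h_2 = x^3 - \tfrac32 x^2$ are our own choices. Both satisfy h(0) = h'(0) = h'(1) = 0, and $h_2$ in addition moves the free value y(1).

```python
def functional(d2y, samples=2000):
    """F(y) = 1/2 * integral over [0, 1] of (y'')**2 dx, midpoint rule.
    The curve enters only through its second derivative d2y."""
    h = 1.0 / samples
    return 0.5 * h * sum(d2y((k + 0.5) * h) ** 2 for k in range(samples))


def d2_extremal(x):
    """Second derivative of the proposed extremal y = x**2 / 2."""
    return 1.0


def d2_h1(x):
    """Second derivative of h1 = x**2 (x - 1)**2."""
    return 12 * x * x - 12 * x + 2


def d2_h2(x):
    """Second derivative of h2 = x**3 - 1.5 x**2."""
    return 6 * x - 3


def perturbed(eps, d2h):
    """Value of F on the admissible curve y + eps * h."""
    return functional(lambda x: d2_extremal(x) + eps * d2h(x))
```

In every case tried the perturbed value exceeds F(y) = 1/2, which is consistent with the first variation vanishing on $y = x^2/2$ and the second-order term being positive.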


3. Completely analogous considerations hold in the case of
several independent variables. For simplicity we confine ourselves
to a consideration of the functional

$$F(u) = \iint_G f\left(x, y, u, u_x, u_y, u_{xx}, u_{xy}, u_{yy}\right)dx\,dy,$$

where the integration is over a region G in the plane of the variables
x, y. The functional is defined on the space $D_2(G)$ of functions
u(x, y) with continuous derivatives up to the second order on the
region G. The function f is assumed to have continuous derivatives
up to the second order with respect to all its arguments. The first
variation of the functional F then has the form

$$\delta F(u, h) = \iint_G\left[f_u\,h + f_{u_x}h_x + f_{u_y}h_y + f_{u_{xx}}h_{xx} + f_{u_{xy}}h_{xy} + f_{u_{yy}}h_{yy}\right]dx\,dy.$$

The displacement vector h = h(x, y) is a function in the same
space $D_2(G)$. At an extremal point u = u(x, y) of the space $D_2(G)$,
the variation $\delta F(u, h)$ vanishes for any h(x, y), so that

$$\iint_G\left[f_u\,h + f_{u_x}h_x + f_{u_y}h_y + f_{u_{xx}}h_{xx} + f_{u_{xy}}h_{xy} + f_{u_{yy}}h_{yy}\right]dx\,dy = 0. \tag{5}$$

We shall consider the values of the functions u(x, y), $u_x$, $u_y$ on the
boundary Γ of the region G fixed, with the consequence that the
quantities h, $h_x$, $h_y$ vanish on Γ; we also suppose that the required
function u(x, y) has continuous derivatives up to and including
the fourth order. Then each of the terms under the integral sign
has derivatives with respect to x and y up to the second order inclusive.
We integrate each term, beginning with the second, once
or twice so as to remove all derivatives of the function h(x, y); in
the process, all integrated terms vanish in virtue of the boundary
conditions, and equation (5) becomes

$$\iint_G\left[f_u - \frac{\partial}{\partial x}f_{u_x} - \frac{\partial}{\partial y}f_{u_y} + \frac{\partial^2}{\partial x^2}f_{u_{xx}} + \frac{\partial^2}{\partial x\,\partial y}f_{u_{xy}} + \frac{\partial^2}{\partial y^2}f_{u_{yy}}\right]h(x, y)\,dx\,dy = 0.$$

Since h(x, y) is arbitrary, the function u satisfies the 4th-order
equation

$$f_u - \frac{\partial}{\partial x}f_{u_x} - \frac{\partial}{\partial y}f_{u_y} + \frac{\partial^2}{\partial x^2}f_{u_{xx}} + \frac{\partial^2}{\partial x\,\partial y}f_{u_{xy}} + \frac{\partial^2}{\partial y^2}f_{u_{yy}} = 0 \tag{6}$$

(the Euler-Ostrogradsky equation). The unknown function u(x, y)
must be determined as a solution of this equation satisfying the
given boundary conditions. Questions as to the existence and uniqueness
of solutions of such equations are considered in the theory
of partial differential equations. Just as in the case of functionals
with first-order derivatives, the fixed-value conditions can be
replaced by others (Section 5).

4. Example. The equation of small vibrations of a rod. A rod,
placed in a position of equilibrium between points 0 and l on the
x-axis, executes transverse vibrations in the (x, u)-plane. We
denote by u(x, t) the profile of the rod at an instant t, and we suppose
that the ends 0, l are "hermetically sealed" (i.e. clamped), so that

$$u(0, t) = u(l, t) = 0, \qquad u_x(0, t) = u_x(l, t) = 0.$$

The kinetic energy of the rod, like that of a string, is expressed by
the integral

$$T = \frac{1}{2}\int_0^l \mu(x)\,u_t^2\,dx,$$

where $\mu(x)\,dx$ is the mass of an element corresponding to an interval
dx. In contrast with the potential energy of a string, that of
the rod is determined, not by the extension, but by the deformation
of the profile; more precisely, the rod is determined as a mechanical
system by the property that the potential energy of each element is
proportional to the square of the curvature of the profile:

$$dU = \frac{k(x)}{2}\,\frac{u_{xx}^2}{\left(1 + u_x^2\right)^3}\,dx.$$

Supposing $u_x$ and $u_{xx}$ so small that we can neglect $u_x^2\,u_{xx}^2$, we can
write dU in the simpler form

$$dU = \frac{k(x)}{2}\,u_{xx}^2\,dx,$$

giving for the potential energy

$$U = \frac{1}{2}\int_0^l k\,u_{xx}^2\,dx.$$

Lagrange's function L = T - U has the form

$$L = T - U = \frac{1}{2}\int_0^l\left[\mu\,u_t^2 - k\,u_{xx}^2\right]dx.$$

Hamilton's functional is therefore represented by the double integral

$$\int_{t_0}^{t_1} L\,dt = \frac{1}{2}\int_{t_0}^{t_1}\!\!\int_0^l\left[\mu\,u_t^2 - k\,u_{xx}^2\right]dx\,dt.$$

We write down the Euler-Ostrogradsky equation (6) for the present
case:

$$-\frac{\partial}{\partial t}(\mu\,u_t) - \frac{\partial^2}{\partial x^2}(k\,u_{xx}) = 0.$$


For constant $\mu$, k, the equation becomes

$$\mu\,u_{tt} + k\,u_{xxxx} = 0,$$

which is the equation of free vibrations of the rod. For the initial
conditions, which are implicit in the physical situation, we can
take the following: the values of the functions u(x, 0) (the initial
configuration) and $u_t(x, 0)$ (the initial velocities of the points of
the rod) are preassigned. As in the case of a string, the uniqueness
of the solution to the problem subject to the initial and boundary
conditions follows from a consideration of the energy integral. The
actual construction of the solution subject to various boundary
conditions is elucidated in courses on partial differential equations.†

By the same reasoning as for the string, the equation of forced
vibrations of a rod will have the form

$$\frac{\partial}{\partial t}(\mu\,u_t) + \frac{\partial^2}{\partial x^2}(k\,u_{xx}) = f(x, t),$$

where f(x, t) dx is the force acting on an element dx. When the
external force is independent of time, so that f(x, t) = f(x), the
equilibrium configuration of the rod is determined by the condition

$$\frac{\partial^2}{\partial x^2}(k\,u_{xx}) = f(x).$$

In particular a uniform rod ($\mu$, k = const.) hangs under gravity
($f(x) = \mu g$) along some fourth-order curve.


Problem. Find the equilibrium configuration of a uniform rod: (a) hermetically
sealed; (b) freely supported at the points $x_{1,2} = \pm l$ (Fig. 8).

[Fig. 8]

Hint. In case (b) the boundary conditions are u(-l) = u(l) = 0, the
derivatives u'(l), u'(-l) remaining free.
Answer (for $k = \mu = 1$):

(a) $u(x) = \dfrac{g}{24}\left(x^2 - l^2\right)^2$;
(b) $u(x) = \dfrac{g}{24}\left(x^2 - l^2\right)\left(x^2 - 5l^2\right)$.
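
Both answers can be verified by direct differentiation. The sketch below represents each profile as a list of polynomial coefficients and checks the equilibrium equation $u^{IV} = g$ (for $k = \mu = 1$) together with the end conditions. It assumes, following the treatment of free boundary values in item 2 of this section, that in case (b) the free derivatives lead to the supplementary conditions $u''(\pm l) = 0$; the helper names are hypothetical.

```python
def poly_derivative(coeffs):
    """Derivative of a polynomial; coeffs[k] is the coefficient of x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def rod_profiles(l=1.0, g=1.0):
    """Coefficient lists of the two answers, expanded in powers of x."""
    # (a) g/24 (x^2 - l^2)^2 = g/24 (x^4 - 2 l^2 x^2 + l^4)
    clamped = [g * l ** 4 / 24, 0.0, -2 * g * l ** 2 / 24, 0.0, g / 24]
    # (b) g/24 (x^2 - l^2)(x^2 - 5 l^2) = g/24 (x^4 - 6 l^2 x^2 + 5 l^4)
    supported = [5 * g * l ** 4 / 24, 0.0, -6 * g * l ** 2 / 24, 0.0, g / 24]
    return clamped, supported


def nth_derivative(coeffs, n):
    """Apply poly_derivative n times."""
    for _ in range(n):
        coeffs = poly_derivative(coeffs)
    return coeffs
```

For example, `nth_derivative(rod_profiles(2.0, 9.81)[0], 4)` reduces to the single constant coefficient g, up to rounding.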


† Cf. for example, A. N. Tikhonov and A. A. Samarskii, Equations of
Mathematical Physics, Pergamon, London, 1963.


Concluding remark 


The calculus of variations emerged as an independent branch 
of mathematics in 1744 or thereabouts with the discovery of a 
general method for solving variational problems by Leonhard Euler
(1707-1783; Swiss by birth, he spent the greater part of his life
working in the St. Petersburg Academy of Sciences). Euler's
method was the forerunner of current "direct methods" in the
calculus of variations. The variational method, of which we have
given an account, was first proposed in 1755 by J. Lagrange
(French mathematician, 1736-1813) in a letter to Euler. In the
development of the classical calculus of variations, the greatest
mathematicians of the nineteenth century, Gauss, Poisson, Ostrogradsky,
Weierstrass, and others, participated. We have only been
able here to sketch the elements of this wide sphere of mathematics,
rich as it is in applications. For a deeper acquaintance, we can
recommend N. I. Akhiezer, The Calculus of Variations (Blaisdell,
New York, 1962), and Calculus of Variations by I. M. Gelfand and
S. V. Fomin (Prentice-Hall, Englewood Cliffs, N. J., 1963).


CHAPTER IV 


THEORY OF THE INTEGRAL 


WE now proceed to extend the concept of the integral. The 
classical definition of the integral, as given by Cauchy and Rie- 
mann, while fully adequate for the case of continuous and piecewise 
continuous functions, turns out to be indequate from a more 
general point of view. Thus, we have seen that the space C; (a, b) 
of continuous functions f(z) on. the closed interval [a, b] with the 
metric 


b 
o(f.9) = f |f(@) — g(w)| de (1) 


is incomplete: there exist fundamental sequences which have no 
limit in the space. Nothing would be gained by adding discontinuous 
functions which are Riemann-integrable to the space C,(a, b). It 
is only by constructing a new integral, of greater scope than the 
Riemann integral, that we can obtain a class of functions that will 
yield a completion of the space C, (a, b) under the metric (1). 

Another problem for the solution of which the old definition of
the integral is insufficient is that of describing a class of pairs of
functions $\varphi(x)$, F(x), sufficiently wide for the formulae

$$F(x) = F(a) + \int_a^x \varphi(\xi) \, d\xi, \qquad (2)$$

$$F'(x) = \varphi(x) \qquad (3)$$

to be equivalent. The solution of this problem will be given in
Chapter VI.


1. SETS OF MEASURE ZERO AND MEASURABLE FUNCTIONS


We shall begin the theory of the integral by investigating a class 
of sets on the closed interval [a, b], which are called sets of measure
zero. 




We shall see in a moment that they are the sets that can be 
neglected in evaluating integrals; more precisely, the integral of 
a function f(x) will be unchanged if the values of the function are 
changed arbitrarily on a set of measure zero. 

Definition. A set A contained in the closed interval [a, b] is said
to be a set of measure zero if for any $\varepsilon > 0$ it can be
covered by a finite or countable system of open intervals, the sum
of the lengths of which does not exceed $\varepsilon$.

Sets containing one point, two points, or more generally any
finite or countable aggregate of points are examples of sets of
measure zero. We give a proof of this. Let
$A = \{x_1, x_2, \ldots, x_n, \ldots\}$ be a countable set and let
$\varepsilon > 0$ be a prescribed number; then a system of open
intervals of lengths $\varepsilon/2, \varepsilon/4, \ldots,
\varepsilon/2^n, \ldots$ covering the points
$x_1, x_2, \ldots, x_n, \ldots$ respectively will cover the whole
set A, and the overall sum of the lengths is not greater than
$\varepsilon/2 + \varepsilon/4 + \cdots + \varepsilon/2^n + \cdots
= \varepsilon$. In particular, the set of rational numbers and
the set of algebraic numbers are sets of measure zero.
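
The covering used in this proof can be tried out numerically. The following is only a sketch under stated assumptions: the enumeration of rationals by increasing denominator, the helper names `rationals_in_unit_interval` and `cover`, and the choice of 50 points are ours and are not part of the text. It places an open interval of length $\varepsilon/2^n$ about the nth point and checks that the total length stays below $\varepsilon$.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` distinct rationals in [0, 1], listed by increasing denominator.

    This particular enumeration is an illustrative assumption; any
    enumeration of a countable set would serve equally well.
    """
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2**n."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)  # half of the length eps / 2**n
        intervals.append((x - half, x + half))
    return intervals


eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
print(total_length < eps)  # the partial sums of eps/2 + eps/4 + ... never reach eps
```

Exact rational arithmetic is used so that the comparison with $\varepsilon$ is not blurred by rounding.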

On the other hand, the whole interval [a,b] is not a set of 
measure zero. For by a well-known lemma of analysis, if a closed 
interval is covered by a countable system of open intervals, a 
finite covering can be extracted from the given covering; the sum 
of the lengths just of these open intervals certainly exceeds b — a, 
i.e. the length of the whole closed interval.

We can already explain why the values of a function on a set 
of measure zero are immaterial in the evaluation of its integral. 
It is enough to show that the integral of a function f(x) equal to 1 
on a set A of measure zero and zero on the complement of A must 
vanish. We cover A with a system of open intervals of total
length $< \varepsilon$. It is clear that if the integral of f(x) is
well defined, it cannot exceed the sum of the areas of the
rectangles of unit height with bases on the given open intervals.
This sum is equal to the sum of the lengths of the open intervals
themselves and is therefore less than $\varepsilon$, i.e. it can be
made as small as desired. It follows that the integral of the
function f(x) must vanish.

We observe that in the definition we have given of a set of 
measure zero we can replace the open covering by a covering of 
closed intervals or any other covering (with no stipulations as to 
the inclusion or exclusion of endpoints). For if there exists a
covering of a set A by intervals of overall length
$< \varepsilon$, then replacing the nth interval by an open
interval containing it, of length exceeding that of the nth
interval by at most $\varepsilon/2^n$, we obtain a covering of A by
open intervals of overall length not exceeding $2\varepsilon$;
hence if A can be covered by some system of intervals of
arbitrarily small overall length, it also has an open covering of
correspondingly small overall length, i.e. the set A has measure
zero.

We give a simple construction for closed sets of measure zero.
Let us suppose that a closed set F on the closed interval [a, b] is
obtained by extracting from the interval an open set comprising a
countable aggregate of disjoint open intervals
$\Delta_1, \Delta_2, \ldots, \Delta_n, \ldots$ of overall length
$b - a$. Then we can certainly assert that the set F has measure
zero, since for a given $\varepsilon > 0$ we can find n such that

$$\sum_{k=n+1}^{\infty} |\Delta_k| < \varepsilon.$$

Here and subsequently we denote the length of an open interval
$\Delta$ by $|\Delta|$. The remaining n intervals
$\Delta_1, \ldots, \Delta_n$ are disjoint and together with the
intervening closed intervals $\Delta'_1, \ldots, \Delta'_m$ (where
m can equal n − 1, n, or n + 1, including the case where two
neighbouring open intervals have a common end-point, so that the
closed interval between them degenerates into a point) give a
finite covering of the whole closed interval [a, b]. Since the
sum of the lengths of $\Delta_1, \ldots, \Delta_n$ exceeds
$b - a - \varepsilon$, the system $\Delta'_1, \ldots, \Delta'_m$
has an overall length $< \varepsilon$; and since it evidently
covers the whole of F, we find that F is a set of measure zero.

It would seem difficult to anticipate that after extracting from
a closed interval of length $b - a$ a system of disjoint open
intervals of overall length also $b - a$, any substantial set can
remain. Nevertheless it turns out that the set remaining can even
be equivalent (in terms of power) to the whole of the original
closed interval.

As an example we have the Cantor set on the closed interval 
[0, 1], with which we are already familiar (Chapter II, Section 4, 
art. 4). We recall how it is constructed. We first extract from the 
closed interval [0, 1] the open interval (1/3, 2/3) of length 1/3, which 
constitutes the middle third of the whole closed interval. We then 
proceed similarly with each of the two remaining closed intervals 
[0, 1/3], [2/3, 1], i.e. we extract from each of them the middle
third, namely, the open interval (1/9, 2/9) from [0, 1/3] and the
open interval (7/9, 8/9) from [2/3, 1]. An analogous procedure is
carried out with each of the four remaining closed intervals
[0, 1/9], [2/9, 1/3], [2/3, 7/9], [8/9, 1] and the process is
continued indefinitely. The set which results is closed and is
termed the Cantor set.

It is not hard to calculate the total sum of the lengths of the
open intervals extracted: the first was of length 1/3, the next
two totalled 2/9, the next four 4/27, and so on; the overall sum
is the sum of the series

$$\frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \cdots + \frac{2^n}{3^{n+1}} + \cdots = 1.$$

By the criterion given above the Cantor set has measure zero.
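
The bookkeeping of the construction can be followed stage by stage. The sketch below is an illustration only; the function name `removed_intervals` and the choice of five stages are our assumptions. It carries out the extraction of middle thirds with exact fractions and records both the open intervals removed and the closed intervals that remain.

```python
from fractions import Fraction


def removed_intervals(stages):
    """Carry out `stages` steps of the middle-third construction on [0, 1].

    Returns the list of open intervals extracted so far and the list of
    closed intervals remaining, each as a pair of end-points.
    """
    closed = [(Fraction(0), Fraction(1))]
    removed = []
    for _ in range(stages):
        next_closed = []
        for a, b in closed:
            third = (b - a) / 3
            removed.append((a + third, b - third))   # open middle third
            next_closed.append((a, a + third))       # left closed piece
            next_closed.append((b - third, b))       # right closed piece
        closed = next_closed
    return removed, closed


removed, closed = removed_intervals(5)
removed_length = sum(b - a for a, b in removed)
remaining_length = sum(b - a for a, b in closed)
print(removed_length, remaining_length)
```

After n stages the extracted length is $1 - (2/3)^n$, the nth partial sum of the series above, so the length remaining, $(2/3)^n$, tends to zero.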

Further, the Cantor set has no isolated points, since the open 
intervals extracted in its construction had no common end-points. 
It follows in virtue of theorem 1, Section 4, Chapter II, that the 
Cantor set is uncountable; moreover in virtue of theorem 2 of the 
same paragraph it has the power of the continuum. 

We shall subsequently make frequent use of the following pro- 
perty of sets of measure zero: 

Lemma. The union of a finite or countable aggregate of sets of mea- 
sure zero is a set of measure zero. 

Proof. We consider at once the case of a countable aggregate
$A_1, \ldots, A_n, \ldots$ of sets of measure zero. For a given
$\varepsilon > 0$ and for each n, we cover the set $A_n$ with a
countable system of open intervals of overall length less than
$\varepsilon/2^n$ (n = 1, 2, ...). Then the whole set
$A = A_1 + \cdots + A_n + \cdots$ will be covered by a countable
system of open intervals (the sum of a countable set of countable
sets) of overall length less than $\varepsilon$. Consequently A has
measure zero, as required.

A set on the closed interval [a, b], complementary to a set of
measure zero, is said to be a set of full measure. The set of irrational 
numbers and the set of transcendental numbers are examples. 

The intersection of a finite or countable system of sets of full measure
is again a set of full measure. For, if $Q_1, Q_2, \ldots$ are sets
of full measure and $A_1 = CQ_1$, $A_2 = CQ_2$, ... are the
complementary sets of measure zero, then

$$C \prod_j Q_j = \sum_j CQ_j = \sum_j A_j$$

has measure zero by the lemma; it follows that $\prod_j Q_j$ is a
set of full measure, as required.

If some property belongs to all the points of a set of full measure
on the closed interval [a, b], it is said to hold for almost all
points of the interval. For example the property that $\xi$ is
irrational holds for almost all points $\xi \in [a, b]$. There exist
functions which are continuous almost everywhere, i.e. continuous
at every point except, perhaps, on a set of measure zero. In the
case of functions which are permitted to assume infinite values, a
sense can be attributed to the expression "finite almost
everywhere"; this means that the set on which the function is
infinite is at most a set of measure zero.

We can now describe the class of functions in which our
subsequent work on the definition of the integral will originate.
The functions which belong to this class are termed measurable
functions. A function is said to be measurable if it is defined and
finite almost everywhere on a closed interval [a, b] and can be
represented as the limit of a sequence of step functions which
converges almost everywhere. A step function in turn is a function
which assumes some constant value on each open interval of some
partition of the closed interval [a, b] by points
$a = x_0 < x_1 < \cdots < x_n = b$. We can disregard the values of a
step function at the partitioning points, since these are finite in
number and therefore constitute a set of measure zero.

The totality of step functions is a linear space under the usual
operations of addition and scalar multiplication; if h, k are step
functions, any linear combination $\alpha h + \beta k$ of them is
also a step function. We easily infer that the measurable functions
also form a linear space. For if the step functions $h_n$ converge
to a function f everywhere except on a set A of measure zero, and
the step functions $k_n$ to a function g everywhere except on a set
B of measure zero, then the step functions
$\alpha h_n + \beta k_n$ converge to the function
$\alpha f + \beta g$ everywhere except on the set A + B, which, as
we showed above, is also a set of measure zero; the function
$\alpha f + \beta g$ is therefore also measurable.

Many other properties possessed by the class of step functions 
can also be carried over by a suitable limiting process to the class 
of measurable functions. We enumerate some of these. 

The product of two step functions is a step function; the product 
of two measurable functions is accordingly a measurable function. 

The quotient of two step functions is a step function, provided
the denominator is non-vanishing. The quotient of two measurable
functions is accordingly a measurable function provided the
denominator is non-zero almost everywhere. For if $h_n \to f$
everywhere except on a set A of measure zero, and $k_n \to g$
everywhere except on a set B of measure zero, then replacing zero
values of the functions $k_n$ by the values 1/n, if necessary, we
get a new sequence of non-vanishing step functions $k_n$ which
converges to g everywhere except on the set B; but then the step
functions $h_n/k_n$ converge to f/g everywhere except on the set
A + B + C, where C is the set of measure zero on which g vanishes.
The set A + B + C has measure zero, and hence the function f/g is
also measurable.

The absolute magnitude |h(x)| of a step function h(x) is a step 
function. It follows easily that the absolute magnitude of a mea- 
surable function is a measurable function. 

If two step functions h, k are given, then 


$$h_1(x) = \max \{h(x), k(x)\}, \qquad k_1(x) = \min \{h(x), k(x)\}$$


are also step functions. In the passage to the limit we get that for 
two measurable functions f, g 


max {f(x), g(x)}, min {f(x), g(x)} 


are also measurable functions. 


[Fig. 9: max {f(x), g(x)} and min {f(x), g(x)} for two functions f(x), g(x)]


In particular, the positive part $f^+(x) = \max \{f(x), 0\}$ and the
negative part $f^-(x) = \max \{0, -f(x)\}$ of a function f(x) are
measurable together with f(x).




We remark the relations

$$f = f^+ - f^-, \qquad |f| = f^+ + f^-,$$

which occur frequently and hold for any function f(x).
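
The decomposition of a function into its positive and negative parts is easily checked pointwise. In the sketch below the helper names `positive_part` and `negative_part` and the sample function $x^3 - x$ are illustrative assumptions; the code verifies at a few sample points that the function is the difference of the two parts and its absolute value is their sum.

```python
def positive_part(f):
    """Return the function x -> max{f(x), 0}."""
    return lambda x: max(f(x), 0.0)


def negative_part(f):
    """Return the function x -> max{0, -f(x)}."""
    return lambda x: max(0.0, -f(x))


def f(x):
    # sample function changing sign on [-2, 2]; chosen only for illustration
    return x * x * x - x


f_plus, f_minus = positive_part(f), negative_part(f)
samples = [k / 4 for k in range(-8, 9)]

for x in samples:
    assert f_plus(x) >= 0 and f_minus(x) >= 0
    assert f_plus(x) - f_minus(x) == f(x)       # difference recovers f
    assert f_plus(x) + f_minus(x) == abs(f(x))  # sum recovers |f|
```

At every point at most one of the two parts is different from zero, which is why both relations hold exactly.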

Problem. It is known that the sum of the lengths of the open
intervals contiguous to a closed set $F \subset [a, b]$ is less than
$b - a$. Show that F is not a set of measure zero.


2. THE CLASS $C^+$


We now set about developing the integral concept. We consider
first a step function h(x), i.e. a function that takes on constant
values $b_1, b_2, \ldots, b_k$ on each of a finite number of
intervals $\Delta_1, \Delta_2, \ldots, \Delta_k$ into which the
closed interval [a, b] is partitioned by points
$a = x_0 < x_1 < \cdots < x_k = b$. The integral of this function is
naturally set equal to

$$\int_a^b h(x) \, dx = \sum_{j=1}^{k} b_j |\Delta_j|.$$


For the sake of brevity we shall in future denote an expression of
the form $\int_a^b h(x) \, dx$ by Ih. It is easily verified that the
integrals of step functions possess the following properties:

(a) $I(h_1 + h_2) = Ih_1 + Ih_2$ for any two step functions $h_1$, $h_2$;
(b) $I(\alpha h) = \alpha Ih$ for any number $\alpha$;
(c) if $h_1 \le h_2$, then $Ih_1 \le Ih_2$; in particular $Ih \ge 0$ if $h \ge 0$.
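
Properties (a) to (c) can be checked on concrete step functions. The following is a minimal sketch, not a definitive implementation: the class `StepFunction`, the helper `combine` and the two sample functions on [0, 3] are hypothetical names and data introduced for illustration. A step function is stored as its partitioning points and its constant values, and sums or maxima are formed on the common refinement of two partitions.

```python
from fractions import Fraction


class StepFunction:
    """Step function with value values[j] on the open interval (points[j], points[j+1])."""

    def __init__(self, points, values):
        assert len(points) == len(values) + 1
        assert all(p < q for p, q in zip(points, points[1:]))
        self.points = [Fraction(p) for p in points]
        self.values = [Fraction(v) for v in values]

    def value_on(self, left, right):
        """Constant value on (left, right), assumed to lie inside one partition interval."""
        for p, q, v in zip(self.points, self.points[1:], self.values):
            if p <= left and right <= q:
                return v
        raise ValueError("interval is not inside a single step")

    def integral(self):
        """Sum of (value) * (length of interval) over the partition."""
        return sum(v * (q - p)
                   for p, q, v in zip(self.points, self.points[1:], self.values))


def combine(h1, h2, op):
    """Apply op pointwise on the common refinement of the two partitions.

    Both step functions are assumed to be given on the same closed interval.
    """
    points = sorted(set(h1.points) | set(h2.points))
    values = [op(h1.value_on(p, q), h2.value_on(p, q))
              for p, q in zip(points, points[1:])]
    return StepFunction(points, values)


h1 = StepFunction([0, 1, 3], [2, -1])
h2 = StepFunction([0, 2, 3], [1, 4])
total = combine(h1, h2, lambda u, v: u + v)
upper = combine(h1, h2, max)            # h1 <= upper and h2 <= upper
scaled = StepFunction(h1.points, [5 * v for v in h1.values])

assert total.integral() == h1.integral() + h2.integral()   # property (a)
assert scaled.integral() == 5 * h1.integral()              # property (b)
assert h1.integral() <= upper.integral()                   # property (c)
```

Passing to the common refinement is what makes property (a) evident: on each interval of the refinement both functions are constant, so the two sums can be added term by term.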


The next two properties are less obvious; they will be proved 
shortly : 

(d) if a sequence $h_n \ge 0$ is monotone decreasing (so that
$h_1(x) \ge h_2(x) \ge \cdots$) and tends to zero almost everywhere,
then $Ih_n \to 0$;

(e) if a sequence $h_n \ge 0$ is monotone decreasing and in addition
$Ih_n \to 0$, then the sequence tends to zero almost everywhere.

We shall henceforth denote a monotone decreasing passage to
the limit by the sign $\searrow$, so that the notation
$f_n \searrow f$, say, denotes that the monotone decreasing sequence
of functions $f_n(x)$ tends almost everywhere to the function f(x).
The sign $\nearrow$ will have an analogous meaning.




We prove property (d). We are given that $h_n \searrow 0$, and we
have to prove that $Ih_n \searrow 0$. Here we cannot apply the
classical theorem on the term-by-term integration of a convergent
sequence of functions, since it presupposes the uniform convergence
of the sequence to its limit. For the proof we proceed as follows.
We denote by A the union of the set on which the sequence $h_n$
does not converge to zero and the countable set of all the points
of discontinuity of the $h_n$; this set is of measure zero. We cover
it with a system of open intervals $\{\Delta_j\}$ of overall length
less than a prescribed $\varepsilon > 0$. With each of the remaining
points x' we associate a number n = n(x') for which the inequality
$h_n(x') < \varepsilon$ is satisfied, and an open interval
$\Delta(x')$ which contains the point and on which the function
$h_n$ is constant. The open intervals $\{\Delta_j\}$ and
$\{\Delta(x')\}$ together constitute a covering of the closed
interval [a, b], from which we can extract a finite covering. We
denote these open intervals by $\Delta_1, \ldots, \Delta_m$;
$\Delta'_1, \ldots, \Delta'_p$, primes indicating those constructed
from the points x'. If r is the greatest of the numbers n(x')
associated with these points x', then neither the function $h_r$
nor any of its successors in the sequence $h_n$ exceeds
$\varepsilon$ on the intervals $\Delta'_1, \ldots, \Delta'_p$. On
the intervals $\Delta_1, \ldots, \Delta_m$, the sum of the lengths
of which is less than $\varepsilon$ by construction, these
functions do not exceed a number M, the maximum of the function
$h_1(x)$. Now it is clear that for the integral of the function
$h_r(x)$ over the closed interval [a, b], and for the integrals of
all the succeeding functions, we get a bound of the form

$$Ih_r \le M\varepsilon + \varepsilon (b - a).$$

Since $\varepsilon$ can be taken arbitrarily small, we have that
$Ih_n \to 0$, as required.

We now prove property (e), the converse of (d). We are given
that the non-negative functions $h_n$ decrease monotonically and
that $Ih_n \searrow 0$. It is obvious, since the functions $h_n$
both decrease and remain non-negative, that they have some limit
$g(x) \ge 0$ as $n \to \infty$. We must show that g(x) vanishes
almost everywhere.

For any function $g(x) \ge 0$, the set F of all the points at which
it is non-zero is the countable sum of the sets

$$F_m = \{x : g(x) \ge 1/m\}.$$

The expression on the right-hand side of the equation indicates
that the set $F_m$ is the set of all points for which
$g(x) \ge 1/m$. If we can show that in our case each of the sets
$F_m$ has measure zero, then their sum F will also have measure
zero. We can therefore restrict ourselves to a study of the set
$F_m$.

It is enough to consider the subset $F'_m$ of $F_m$ on which all
the functions $h_n$ are continuous (its complement in $F_m$ is
countable and therefore has measure zero). Since
$h_n(x) \ge g(x)$, we have $h_n(x) \ge 1/m$ at each point of
$F'_m$. We fix the number n; then the intervals on which the
function $h_n(x)$ is constant and which correspond to values of
$h_n(x)$ greater than or equal to 1/m form a covering of the set
$F'_m$. Let $\delta_n$ denote the sum of the lengths of these
intervals. Since evidently

$$Ih_n \ge \delta_n \cdot \frac{1}{m},$$

we get that

$$\delta_n \le m \, Ih_n \to 0 \quad \text{as } n \to \infty.$$

Thus, for sufficiently large n, the set $F'_m$ is covered by a
system of open intervals of arbitrarily small overall length.
$F'_m$ is therefore a set of measure zero, as required.

We now proceed to extend the definition of the integral from the 
class of step functions to a wider class. 

First of all we recall the scheme by which the Riemann integral
is constructed. For a function f(x) the procedure is as follows. The
closed interval [a, b] is partitioned by points
$a = x_0 < x_1 < \cdots < x_k = b$ into separate open intervals
$\Delta_1, \ldots, \Delta_k$; we denote

$$m_j = \inf_{x \in \Delta_j} f(x), \qquad M_j = \sup_{x \in \Delta_j} f(x) \qquad (j = 1, 2, \ldots, k)$$

and form the two sums (which naturally depend on the aggregate
$\Pi$ of the partitioning points of [a, b]):

$$s_\Pi = \sum_j m_j |\Delta_j|, \qquad S_\Pi = \sum_j M_j |\Delta_j|. \qquad (1)$$

The first sum is called the lower, the second the upper. If the
partition $\Pi$ is replaced by a partition $\Pi'$ by adding further
partitioning points, then

$$s_\Pi \le s_{\Pi'}, \qquad S_{\Pi'} \le S_\Pi.$$

It follows in particular that $s_{\Pi_1} \le S_{\Pi_2}$ for any
partitions $\Pi_1$, $\Pi_2$. We consider further an arbitrary
sequence of partitions $\Pi_1, \Pi_2, \ldots, \Pi_n, \ldots$, each
of which is obtained by the addition of new partitioning points to
its predecessor. The corresponding lower and upper sums form
monotone sequences which approach each other:

$$s_1 \le s_2 \le \cdots \le s_n \le \cdots \le S_n \le \cdots \le S_2 \le S_1.$$

Each sequence therefore has a limit: $s_n \nearrow s$,
$S_n \searrow S$, with $s \le S$. It can be shown that the numbers
s and S are independent of the choice of the sequence of partitions
$\Pi_1, \Pi_2, \ldots, \Pi_n, \ldots$, provided that as n increases
the length of the greatest open interval in the partition $\Pi_n$
decreases indefinitely. The function f(x) is said to be
Riemann-integrable if s = S; the value of the integral of f(x) then
is taken as the common value of these limits. If s < S, the
function f(x) is said to be not Riemann-integrable.
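
The behaviour of the lower and upper sums under refinement can be observed on a simple example. The sketch below rests on assumptions of ours: the function $x^2$ on [0, 1], partitions into $2^n$ equal parts (each obtained from its predecessor by adding points), and the helper name `lower_upper_sums`. Since the function is increasing, its infimum and supremum on each interval are its values at the left and right end-points.

```python
from fractions import Fraction


def lower_upper_sums(f, a, b, pieces):
    """Lower and upper sums of an increasing function f over `pieces` equal parts of [a, b].

    For an increasing f the infimum on each interval is approached at the
    left end-point and the supremum at the right end-point.
    """
    width = Fraction(b - a, pieces)
    points = [a + j * width for j in range(pieces + 1)]
    lower = sum(f(p) * width for p in points[:-1])
    upper = sum(f(q) * width for q in points[1:])
    return lower, upper


def square(x):
    return x * x


# Each partition refines the previous one, since the number of parts doubles.
sums = [lower_upper_sums(square, 0, 1, 2 ** n) for n in range(1, 8)]
for lower, upper in sums:
    print(float(lower), float(upper))
```

The lower sums increase, the upper sums decrease, and their difference for $2^n$ parts is $1/2^n$, so both sequences have the common limit 1/3.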

We now consider the functioning of this process in respect of
step functions. Each partition $\Pi$ of the closed interval [a, b]
by points $a = x_0 < x_1 < \cdots < x_k = b$ determines two step
functions $h_\Pi(x)$, $H_\Pi(x)$; the first of these assumes the
value $m_j$ on the open interval $\Delta_j$, the second the value
$M_j$. The lower and upper sums (1) represent the integrals of these
step functions. The sequence of partitions
$\Pi_1, \Pi_2, \ldots, \Pi_n, \ldots$ determines two sequences of
functions $h_n(x)$, $H_n(x)$, the first of which is increasing, the
second decreasing. Further, let s(x), S(x) be the limiting functions
of these sequences, so that

$$h_n(x) \nearrow s(x), \qquad H_n(x) \searrow S(x).$$

Since in addition $h_n(x) \le f(x) \le H_n(x)$, we have
$s(x) \le f(x) \le S(x)$. We claim that if the function f(x) is
Riemann-integrable, then these three functions coincide almost
everywhere. For the difference S(x) − s(x) is the limit of the
sequence of non-negative step functions $H_n(x) - h_n(x)$. This
sequence is monotone decreasing and, in the case that f(x) is
integrable, the integrals of $H_n(x) - h_n(x)$ tend to zero. But
then, by property (e), the sequence $H_n(x) - h_n(x)$ tends almost
everywhere to zero. Hence if f(x) is integrable the functions s(x),
S(x) coincide almost everywhere with each other and with the
function f(x). We see that a function which is Riemann-integrable
is the limit (almost everywhere) of an increasing sequence of step
functions $s_n(x)$ and of a decreasing sequence $S_n(x)$; its
integral is the limit of the integrals of the step functions that
constitute these sequences.

Conversely if the functions $s_n(x)$, $S_n(x)$ converge almost
everywhere to f(x), then the decreasing sequence of differences
$S_n(x) - s_n(x)$ tends almost everywhere to zero. Hence, by
property (d) we have $I(S_n(x) - s_n(x)) \searrow 0$. It follows
that the numerical sequences $s_n = I(s_n(x)) \nearrow s$,
$S_n = I(S_n(x)) \searrow S$ have a common limit; and this means
that the function f(x) is Riemann-integrable.

Thus a function f(x) is Riemann-integrable if and only if it is the
limit (in the sense of convergence almost everywhere) of some
increasing sequence of step functions $s_n(x) \le f(x)$ and
simultaneously the limit of some decreasing sequence of step
functions $S_n(x) \ge f(x)$; the integral of f is then the common
value of the limits of the integrals of the functions $s_n(x)$,
$S_n(x)$.

This observation provides us with a footing on which to extend 
the definition of the integral to a wider class of functions. 

We introduce a class of functions which we shall denote by $C^+$:
a function f(x) belongs, by definition, to the class $C^+$ if it can
be represented as the limit (in the sense of convergence almost
everywhere) of a monotone increasing sequence of step functions

$$h_n \nearrow f,$$

where the integrals of these functions are jointly bounded by the
inequality

$$Ih_n \le C.$$


We show first that every function f in the class $C^+$ is finite
almost everywhere. Let H be the set of points on which
$f(x) = \infty$. We can presuppose that at each point of the set H
all the functions $h_n(x)$ are continuous and satisfy the relation
$h_n(x) \to \infty$. We choose an arbitrary number N; at each point
of H the inequality

$$h_n(x) > N$$

is satisfied from some n on, so that H is covered by the countable
sum of the sets of the form $\{x : h_n(x) > N\}$ (n = 1, 2, ...).
Each of these sets comprises a finite number of open intervals;
hence their union is also a countable sum of open intervals:

$$U = \Delta_1^{(1)} + \cdots + \Delta_{m_1}^{(1)} + \Delta_1^{(2)} + \cdots + \Delta_{m_2}^{(2)} + \cdots + \Delta_1^{(n)} + \cdots + \Delta_{m_n}^{(n)} + \cdots$$

Here $\Delta_1^{(1)}, \ldots, \Delta_{m_1}^{(1)}$ denote the
component open intervals of the set $\{x : h_1(x) > N\}$;
$\Delta_1^{(2)}, \ldots, \Delta_{m_2}^{(2)}$ denote the component
open intervals of the set
$\{x : h_2(x) > N\} - \{x : h_1(x) > N\}$, so that
$\Delta_1^{(1)}, \ldots, \Delta_{m_1}^{(1)}$,
$\Delta_1^{(2)}, \ldots, \Delta_{m_2}^{(2)}$ constitute the set
$\{x : h_2(x) > N\}$, and so on. We evaluate the sum of the lengths
of all these intervals. First we let

$$\delta_k = |\Delta_1^{(1)}| + \cdots + |\Delta_{m_1}^{(1)}| + \cdots + |\Delta_1^{(k)}| + \cdots + |\Delta_{m_k}^{(k)}|;$$

$\delta_k$ is the sum of the lengths of the open intervals which
constitute the set $\{x : h_k(x) > N\}$. We have by hypothesis

$$\delta_k N \le Ih_k \le C.$$

It follows that $\delta_k \le C/N$; since this is true for any k,
we conclude that the overall sum of the lengths of all the open
intervals comprising U does not exceed C/N. Since N can be taken
arbitrarily large, we see that H can be covered by a countable
system of open intervals of arbitrarily small overall length. This
means that H is a set of measure zero, as asserted.

In particular, every function $f \in C^+$ is measurable (Section 1).
We now define the integral of a function f belonging to the class
$C^+$ by the formula

$$If = \lim_{n \to \infty} Ih_n, \qquad (2)$$

where $h_n$ is the sequence of step functions involved in the
definition of the function f. Since the sequence of numbers $Ih_n$
is monotone increasing and bounded, the limit on the right exists;
but we still have to show that it is independent of the choice of
the sequence $h_n$ which defines the function f. To do this, we
prove the following more general result: if $h_n$, $k_n$ are step
functions with jointly bounded integrals, and if

$$h_n \nearrow f, \qquad k_n \nearrow g, \qquad f \le g$$

almost everywhere, then

$$\lim_{n \to \infty} Ih_n \le \lim_{n \to \infty} Ik_n. \qquad (3)$$

For the proof, we fix a number m and consider the decreasing
sequence of step functions

$$h_m - k_n \qquad (n = 1, 2, \ldots).$$

Its limit is $h_m - g \le f - g \le 0$; but then
$(h_m - k_n)^+ \searrow 0$, and it follows from property (d) that
$I(h_m - k_n)^+ \searrow 0$; since
$I(h_m - k_n) \le I(h_m - k_n)^+$, the decreasing sequence
$I(h_m - k_n) = Ih_m - Ik_n$ tends to some non-positive limit. We
conclude that $Ih_m \le \lim_{n \to \infty} Ik_n$. Since this
inequality holds for any m, the passage to the limit as
$m \to \infty$ gives us (3), as required. Taking g = f, with $h_n$
and $k_n$ two sequences defining the same function, we get
$\lim Ih_n \le \lim Ik_n$ and, by symmetry,
$\lim Ik_n \le \lim Ih_n$; the two limits therefore coincide.
Thus the integral of a function $f \in C^+$ is well-defined by
formula (2). If $f \in C^+$, $g \in C^+$, $f \le g$, then
$If \le Ig$.

In particular, every function f which is Riemann-integrable
belongs to the class $C^+$, and its Riemann integral, as the limit of
its lower sums, coincides with the integral If, defined by us as the
limit of the integrals of the step functions $s_n$ corresponding to
those lower sums.

We see that our definition of the integral has at least as broad 
a compass as Riemann’s. In fact it has a considerably broader 
compass. For example, Dirichlet's function $\chi(x)$, equal to 0 for x
irrational and 1 for x rational, is not Riemann-integrable; but seen
from our new angle, it vanishes almost everywhere and is therefore 
integrable, its integral being equal to zero. We could also give 
more complex examples, where a function integrable in the new 
sense but not Riemann-integrable cannot be reduced to a Riemann- 
integrable function by modifying it on a set of measure zero. 

In the passage to the limit some, though not all, of the properties
of the integrals of step functions carry over to the integrals of
functions of the class $C^+$. It is easily verified that:

(a) The class $C^+$ contains the function f + g whenever it
contains f and g, and then

$$I(f + g) = If + Ig.$$

(b) The class $C^+$ contains together with a function f its
product by any number $\alpha \ge 0$, and

$$I(\alpha f) = \alpha If.$$

We note that in the class $C^+$ it is not possible to subtract
functions or to multiply them by negative numbers since we are
restricted to increasing sequences of step functions.

(c) The class $C^+$ contains together with functions f, g

$$\min (f, g), \qquad \max (f, g).$$

In particular, the function $f^+ = \max (f, 0)$ belongs to $C^+$
together with f. (This cannot be said of the functions $f^-$, |f|.)

The following property shows that the class C+ is closed in 
respect of the limiting passage through increasing sequences of 
functions with bounded integrals: 

THEOREM. If $f_n \in C^+$ (n = 1, 2, ...), $f_n \nearrow f$, and
$If_n \le C$, then $f \in C^+$ and $If = \lim If_n$.




Proof. For each of the functions $f_k$ we construct a defining
sequence of step functions:

$$\begin{aligned}
h_{11} \le h_{12} \le \cdots \le h_{1n} \le \cdots, &\qquad h_{1n} \nearrow f_1, \\
h_{21} \le h_{22} \le \cdots \le h_{2n} \le \cdots, &\qquad h_{2n} \nearrow f_2, \\
\cdots\cdots\cdots &
\end{aligned}$$

Further, we put $h_n = \max (h_{1n}, \ldots, h_{nn})$. It is evident
that $h_n$ is also a step function and that the sequence $h_n$
(n = 1, 2, ...) is monotone increasing. Moreover
$h_n \le \max (f_1, \ldots, f_n) = f_n$, whence
$Ih_n \le If_n \le C$. Let $f^* = \lim h_n$; by the definition of
the class $C^+$, we have $f^* \in C^+$ and $If^* = \lim Ih_n$. But
since $h_{kn} \le h_n \le f_n$ for any fixed k and $n \ge k$, we
find, passing to the limit as $n \to \infty$, that
$f_k \le f^* \le f$; letting $k \to \infty$, we obtain $f^* = f$
(almost everywhere). Thus $f \in C^+$. Further,
$Ih_{kn} \le Ih_n \le If_n \le If$; since
$Ih_n \nearrow If^* = If$, we have $If_n \nearrow If$, which
completes the proof.

COROLLARY. If for a series $\sum_{k=1}^{\infty} g_k$,
$g_k \in C^+$, $g_k \ge 0$, the integrals of the partial sums are
jointly bounded, so that

$$I\Bigl(\sum_{k=1}^{n} g_k\Bigr) \le C,$$

then $f = \sum_{k=1}^{\infty} g_k$ is a function of the class $C^+$
and $If = \sum_{k=1}^{\infty} Ig_k$.

For the proof it is sufficient to put $f_n = \sum_{k=1}^{n} g_k$
and apply the preceding theorem.


Problems. 1. If f(x) differs from a function $f_1(x)$ of class $C^+$
only on a set of measure zero, f(x) also belongs to $C^+$.

Hint. The sequence $h_n(x) \nearrow f_1(x)$ is convergent almost
everywhere to f(x) also.

2. Show that the Dirichlet function, equal to zero for x rational
and 1 for x irrational, belongs to class $C^+$.

3. A function equal to zero on a closed set F and 1 on its
complement belongs to class $C^+$.

4. A closed set F exists such that a function equal to 1 on F and
0 on its complement does not belong to $C^+$.

Hint. Take F as nowhere dense and not a set of measure zero. See
the problem of Section 1.

5. Show that a function f(x) is Riemann-integrable or differs from
such on a set of measure 0 if and only if $f \in C^+$,
$-f \in C^+$.

6. Show that a function f(x) is Riemann-integrable if and only if
the set of its points of discontinuity has measure 0.

Hint. If $s_n(x) \nearrow f(x)$ and $S_n(x) \searrow f(x)$ and $x_0$
is a point of continuity of all the step functions $s_n(x)$ and
$S_n(x)$, then $x_0$ is a point of continuity of f(x). Conversely,
$s_n(x_0) \nearrow f(x_0)$, $S_n(x_0) \searrow f(x_0)$ at every point
of continuity $x_0$.


3. SUMMABLE FUNCTIONS 


1. In this section we shall complete the construction of the
integral by extending it from the class $C^+$ to a wider class L in
which all the natural functional operations can be carried out.

We shall call summable (or Lebesgue-integrable) any function
$\varphi(x)$ ($a \le x \le b$) that can be represented as the
difference

$$\varphi = f - g$$

of two functions of the class $C^+$. We denote by L the totality of
summable functions. In the class of summable functions the
following operations can be performed:

(a) Addition. If $\varphi = f - g$, $\varphi_1 = f_1 - g_1$ are
summable functions, with f, g, $f_1$, $g_1$ functions of the class
$C^+$, then

$$\varphi + \varphi_1 = (f + f_1) - (g + g_1)$$

and since $f + f_1$, $g + g_1 \in C^+$, it follows that the
function $\varphi + \varphi_1$ belongs to L.

(b) Multiplication by any real number $\alpha$. If
$\alpha \ge 0$, then $\varphi = f - g$, $f \in C^+$, $g \in C^+$
implies $\alpha\varphi = \alpha f - \alpha g$,
$\alpha f \in C^+$, $\alpha g \in C^+$, and consequently
$\alpha\varphi \in L$; and if $\alpha < 0$, then $-\alpha > 0$ and
the equation $\alpha\varphi = (-\alpha) g - (-\alpha) f$ shows that
as before $\alpha\varphi \in L$.

It follows from (a) and (b) that any linear combination of
functions of the class L is also a function belonging to L.

(c) Taking the modulus of a function. Let $\varphi = f - g$,
$f \in C^+$, $g \in C^+$; then max (f, g), min (f, g) also belong to
the class $C^+$; it follows that
$|\varphi| = \max (f, g) - \min (f, g)$ belongs to the class L.
Solving the equations

$$\varphi = \varphi^+ - \varphi^-,$$
$$|\varphi| = \varphi^+ + \varphi^-,$$

we see that the functions $\varphi^+$, $\varphi^-$ belong to the
class L together with the function $\varphi$.




Further, the equations

$$\max (\varphi, \psi) = (\varphi - \psi)^+ + \psi,$$
$$\min (\varphi, \psi) = -\max (-\varphi, -\psi)$$

show that the class L contains together with functions $\varphi$,
$\psi$ their maximum and minimum.

2. We now give the definition of the integral in the class L. With
the decomposition

$$\varphi = f - g, \qquad f \in C^+, \quad g \in C^+, \qquad (1)$$

we put

$$I\varphi = If - Ig.$$

We verify that $I\varphi$ is uniquely defined in this way. Let a
second decomposition

$$\varphi = f_1 - g_1, \qquad f_1 \in C^+, \quad g_1 \in C^+$$

exist alongside (1). We shall prove that $If - Ig = If_1 - Ig_1$.
This equation is equivalent to the equation

$$If + Ig_1 = Ig + If_1. \qquad (2)$$

But since $f + g_1 = f_1 + g$, we have in virtue of the uniqueness of
the integral in the class $C^+$

$$I(f + g_1) = I(g + f_1),$$

from which (2) follows by the additivity of the integral in $C^+$.

We show further that the integral obtained possesses the usual
linear properties in the class L. Let $\varphi = f - g$,
$\varphi_1 = f_1 - g_1$, where f, g, $f_1$, $g_1$ belong to the
class $C^+$. Then $\varphi + \varphi_1 = (f + f_1) - (g + g_1)$,
and by definition

$$I(\varphi + \varphi_1) = I(f + f_1) - I(g + g_1) = If + If_1 - Ig - Ig_1 = (If - Ig) + (If_1 - Ig_1) = I\varphi + I\varphi_1.$$

Thus the integrals of sums are equal to the sums of integrals.
Further, for $\alpha > 0$,

$$I(\alpha\varphi) = I(\alpha f - \alpha g) = I(\alpha f) - I(\alpha g) = \alpha If - \alpha Ig = \alpha (If - Ig) = \alpha I\varphi;$$

again, $I(-\varphi) = I(g - f) = Ig - If = -I\varphi$, and hence for
$\alpha < 0$ we have

$$I(\alpha\varphi) = I(-|\alpha|\varphi) = -I(|\alpha|\varphi) = -|\alpha| I\varphi = \alpha I\varphi,$$

so that a scalar $\alpha$ can be carried through the integral sign
whatever its sign.

We observe further that if $\varphi \ge 0$, $\varphi \in L$, then
$I\varphi \ge 0$. For if $\varphi = f - g$, $f \in C^+$,
$g \in C^+$, $\varphi \ge 0$, then $f \ge g$ and $If \ge Ig$; hence
$I\varphi = If - Ig \ge 0$. We make the further deduction that
$\varphi_1 \le \varphi_2$ implies $I\varphi_1 \le I\varphi_2$.

3. We now prove an important theorem on the term-by-term 
integration of series with positive terms. 


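THEOREM (Beppo Levi, 1906). If for a series
$\sum_{k=1}^{\infty} \varphi_k$, $\varphi_k \in L$,
$\varphi_k \ge 0$, the integrals of the partial sums have a common
bound, so that

$$I\Bigl(\sum_{k=1}^{n} \varphi_k\Bigr) \le C,$$

then the function $\varphi = \sum_{k=1}^{\infty} \varphi_k$ is
summable and $I\varphi = \sum_{k=1}^{\infty} I\varphi_k$.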


Proof. We observe first that in the decomposition of a summable
function $\varphi = f - g$, $f \in C^+$, $g \in C^+$, the functions
f, g can be made subject to further conditions. For example, g can
be chosen so that $g \ge 0$, $Ig < \varepsilon$, where
$\varepsilon > 0$ is a prescribed number. To do this we have to
consider a sequence of step functions $h_n \nearrow g$ such that
$Ig = \lim Ih_n$, and then write

$$\varphi = f - g = (f - h_n) - (g - h_n) = f_n - g_n.$$

It is clear that for sufficiently large n, the required condition is
satisfied for the function $g_n = g - h_n$. In addition we observe
that if $\varphi \ge 0$, the function
$f_n = f - h_n \ge f - g = \varphi$ also becomes non-negative.

Now for each of the functions $\varphi_k$ that occur in the formulation
of the theorem we construct a decomposition

$$\varphi_k = f_k - g_k,$$

where $f_k \ge 0$, $g_k \ge 0$, $Ig_k < 1/2^k$ $(k = 1, 2, \dots)$. The series
$\sum_{k=1}^{\infty} g_k$ then satisfies the conditions of the corollary to
the theorem of Section 2 ($g_k \ge 0$, $I\bigl(\sum_{k=1}^{n} g_k\bigr) < 1$).
Hence $g = \sum_{k=1}^{\infty} g_k$ belongs to the class $C^+$ and
$Ig = \sum_{k=1}^{\infty} Ig_k$. We show that the series
$\sum_{k=1}^{\infty} f_k$ also satisfies the conditions of the corollary;
for we have $f_k \ge 0$ and

$$I\Bigl(\sum_{k=1}^{n} f_k\Bigr) = I\Bigl(\sum_{k=1}^{n} \varphi_k\Bigr)
+ I\Bigl(\sum_{k=1}^{n} g_k\Bigr) \le C + 1.$$

Hence $f = \sum_{k=1}^{\infty} f_k$ also belongs to the class $C^+$ and
$If = \sum_{k=1}^{\infty} If_k$.

It follows that
$\varphi = \sum_{k=1}^{\infty} \varphi_k = \sum_{k=1}^{\infty} f_k
- \sum_{k=1}^{\infty} g_k = f - g$ belongs to the class $L$ and

$$I\varphi = If - Ig = \sum_{k=1}^{\infty} If_k - \sum_{k=1}^{\infty} Ig_k
= \sum_{k=1}^{\infty} I(f_k - g_k) = \sum_{k=1}^{\infty} I\varphi_k,$$

which completes the proof.

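COROLLARY. If a monotone increasing sequence of summable functions
$\psi_n$ tends to a limit $\psi$ and $I\psi_n \le C$, then the function
$\psi$ is summable and

$$I\psi = \lim I\psi_n.$$

For the proof it is sufficient to put $\varphi_1 = \psi_2 - \psi_1$, ...,
$\varphi_n = \psi_{n+1} - \psi_n$, ... and apply the preceding theorem;
the partial sums of this series are $\psi_{n+1} - \psi_1$, so their
integrals are bounded by $C - I\psi_1$. An analogous result obviously
holds for decreasing sequences $\psi_n \searrow \psi$ provided
$I\psi_n \ge C$.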

4. We shall subsequently consider arbitrary (non-monotone)
passages to the limit. Classical examples show that theorems of
the form "$\varphi_n \to \varphi$ implies $I\varphi_n \to I\varphi$" cannot
be expected in the absence of additional assumptions about the nature
of the convergence of the sequence $\varphi_n$ to its limit. For example,
the functions

$$\varphi_n(x) = \begin{cases}
n \sin nx & \text{for } 0 \le x \le \pi/n, \\
0 & \text{for } \pi/n < x \le \pi
\end{cases}$$

converge to zero for every $x \in (0, \pi]$, while their integrals remain
constant (with the value 2) and do not tend to the integral of the
limiting function.
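
The behaviour of this example can be checked numerically. The following is only an illustrative sketch, not part of the original text: the helper names `phi` and `integral` are hypothetical, and the integral is approximated by a midpoint rule rather than computed in the Lebesgue sense.

```python
import math

def phi(n, x):
    """phi_n(x) = n*sin(n*x) on [0, pi/n], and 0 on (pi/n, pi]."""
    return n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0

def integral(n, steps=20_000):
    """Midpoint-rule approximation of the integral of phi_n over [0, pi/n]
    (phi_n vanishes on the rest of [0, pi])."""
    width = (math.pi / n) / steps
    return sum(phi(n, (i + 0.5) * width) for i in range(steps)) * width

# For a fixed x > 0 the values phi_n(x) vanish as soon as pi/n < x ...
x = 0.1
values = [phi(n, x) for n in (1, 10, 100, 1000)]

# ... while the integrals stay (approximately) equal to 2 for every n.
integrals = [integral(n) for n in (1, 10, 100, 1000)]
```

The mass of $\varphi_n$ concentrates on the shrinking interval $[0, \pi/n]$, which is why no single summable function can bound all the $\varphi_n$ at once.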

We consider the aggregate $L(\varphi_0)$ of all summable functions
$\varphi$ that satisfy the inequality

$$-\varphi_0 \le \varphi \le \varphi_0,$$

where $\varphi_0$ is a fixed non-negative summable function. Evidently the
inequality

$$-I\varphi_0 \le I\varphi \le I\varphi_0$$

is satisfied for each function $\varphi \in L(\varphi_0)$. If there exists a
monotone sequence of functions $\varphi_n$, increasing or decreasing, that
belong to the set $L(\varphi_0)$, the limiting function $\varphi$ will
obviously satisfy this inequality together with the functions
$\varphi_n$; by the preceding corollary this function is also summable.
Consequently the set $L(\varphi_0)$ is closed in respect of monotone
passages to the limit. We observe further that we can assert for any
sequence $\varphi_n \in L(\varphi_0)$ that the functions

$$\sup_n \{\varphi_1(x), \varphi_2(x), \dots, \varphi_n(x), \dots\}$$

and

$$\inf_n \{\varphi_1(x), \varphi_2(x), \dots, \varphi_n(x), \dots\}$$

also belong to $L(\varphi_0)$: the first of them is the limit as
$n \to \infty$ of the increasing sequence of functions

$$\max \{\varphi_1(x), \dots, \varphi_n(x)\} \in L(\varphi_0),$$

and the second is the limit of the decreasing sequence

$$\min \{\varphi_1(x), \dots, \varphi_n(x)\} \in L(\varphi_0).$$


Now let $\varphi_n \in L(\varphi_0)$ be any sequence converging almost
everywhere to some function $\varphi$; we shall show that $\varphi$ also
belongs to the class $L(\varphi_0)$. It is enough to prove that $\varphi$
can be represented as the limit of some monotone sequence of functions
in the class $L(\varphi_0)$. We put

$$\psi_n(x) = \sup \{\varphi_n(x), \varphi_{n+1}(x), \dots\},$$

$$\psi'_n(x) = \inf \{\varphi_n(x), \varphi_{n+1}(x), \dots\}.$$

By what we have just proved, these functions are summable and
belong to $L(\varphi_0)$. If we consider only those values of $x$ for which
the functions $\varphi_n(x)$ have the limit $\varphi(x)$, it is clear that
for each such value

$$\psi_n(x) \ge \lim_{p \to \infty} \varphi_{n+p}(x) = \varphi(x),$$

$$\psi'_n(x) \le \lim_{p \to \infty} \varphi_{n+p}(x) = \varphi(x).$$

Further, by removing $\varphi_n$ from the aggregate
$\varphi_n, \varphi_{n+1}, \dots$, we can only diminish its upper bound
and increase its lower bound; hence

$$\psi_{n+1}(x) \le \psi_n(x),$$
$$\psi'_{n+1}(x) \ge \psi'_n(x).$$

Consequently the sequence $\psi_n(x)$ is decreasing, and $\psi'_n(x)$ is
increasing. Furthermore, it is clear that $\varphi_n(x) \to \varphi(x)$
implies

$$\psi_n(x) \searrow \varphi(x), \quad \psi'_n(x) \nearrow \varphi(x).$$

Thus the function $\varphi(x)$ is the limit of an increasing sequence of
functions in the class $L(\varphi_0)$ (and at the same time the limit of a
decreasing sequence of functions of this class). It follows that
$\varphi \in L(\varphi_0)$, as asserted. In addition we have
$I\psi_n \searrow I\varphi$, $I\psi'_n \nearrow I\varphi$,
$I\psi'_n \le I\varphi_n \le I\psi_n$, whence $I\varphi_n \to I\varphi$.
We have proved the following theorem:

THEOREM (Lebesgue, 1902). If a sequence of summable functions
$\varphi_n$ converges almost everywhere to a function $\varphi$ and
satisfies the condition

$$|\varphi_n(x)| \le \varphi_0(x) \in L,$$

then the function $\varphi$ is summable and
$I\varphi = \lim I\varphi_n$. In particular the equation
$I\varphi = \lim I\varphi_n$ holds if the functions $\varphi_n$ have a
common bound.

From this theorem we can obtain an important result in connection
with the composition of the class $L(\varphi_0)$. We shall show, namely,
that if some measurable function $\varphi$ satisfies (almost everywhere)
the inequality

$$-\varphi_0 \le \varphi \le \varphi_0 \in L,$$

then it is summable (and therefore belongs to the class $L(\varphi_0)$).
For let $h_n$ be a sequence of step functions that defines the
measurable function $\varphi$. Cutting it off above by $\varphi_0$ and
below by $-\varphi_0$, i.e. replacing it by the functions

$$\varphi_n = \max \{-\varphi_0, \min (h_n, \varphi_0)\},$$

we get a sequence of summable functions belonging to the class
$L(\varphi_0)$ and converging almost everywhere to the function
$\varphi$. This means that $\varphi = \lim \varphi_n$ is a summable
function, as required.

In particular, every bounded measurable function is summable.

Again, the theorem we have proved on summable functions
enables us to draw further inferences about measurable functions.
We shall show that if the limit $\varphi(x)$ of an almost everywhere
convergent sequence of measurable functions $\varphi_n(x)$ is finite
almost everywhere, then it is a measurable function. It is enough to
consider the case $\varphi_n(x) \ge 0$, since in the contrary event the
sequences $\varphi_n^+$, $\varphi_n^-$ can be considered separately. But if
the $\varphi_n$ converge to $\varphi$ almost everywhere, the sequence of
functions

$$\psi_n = \frac{1}{1 + \varphi_n}$$

will converge almost everywhere to

$$\psi = \frac{1}{1 + \varphi}.$$

The functions $\psi_n$ are bounded by 1 and 0, above and below
respectively, and are measurable. They are therefore summable and by
what we have proved, their limit $\psi$ is also summable and is
therefore a measurable function. We note that $\psi$ can vanish only
when $\varphi(x) = \infty$, i.e. on a set of measure zero. Hence,
inverting the equation we have got, we find that

$$\varphi = \frac{1 - \psi}{\psi}$$

is a measurable function, since both numerator and denominator
are measurable and the denominator is non-vanishing almost
everywhere.

5. In one instance we can assert the summability of the limiting
function of a sequence $\varphi_n$ when the hypothesis that the functions
$|\varphi_n|$ are bounded by a summable function is replaced by certain
other assumptions:

LEMMA (Fatou, 1906). If $\varphi_n \ge 0$ are summable functions,
$\varphi_n \to \varphi$ almost everywhere, and $I\varphi_n \le C$, then the
function $\varphi$ is summable and

$$0 \le I\varphi \le C.$$

Proof. We put

$$\psi_n = \inf \{\varphi_n, \varphi_{n+1}, \dots\} \ge 0.$$

As above, the functions $\psi_n$ form an increasing sequence which
converges almost everywhere to the function $\varphi$. Further,
$\psi_n \le \varphi_n$, $I\psi_n \le I\varphi_n \le C$; by the corollary to
Beppo Levi's theorem the function $\varphi$ is summable and
$I\psi_n \nearrow I\varphi$. In particular,
$0 \le I\varphi = \lim I\psi_n \le C$ as required.

Now let the function $\varphi_0 \ge 0$ be summable and let
$I\varphi_0 = 0$; we shall show that $\varphi_0 = 0$ almost everywhere. We
put $\varphi_n = n\varphi_0$; the function $\varphi_n$ tends to a limit
$\varphi$, equal to zero for $\varphi_0 = 0$ and infinity for
$\varphi_0 > 0$. But since, by Fatou's lemma, the limiting function
must be summable, and in particular measurable, the set of those
$x$ for which $\varphi(x) = \infty$ is of measure zero. This is the same
as the set of those $x$ for which $\varphi_0(x) > 0$, which is therefore
a set of measure zero. We get:
if the integral of a non-negative summable function vanishes, the
function itself vanishes almost everywhere.

6. In this article, we shall regard two summable functions as 
the same if they coincide on a set of full measure. 

To be more precise, we pass from summable functions themselves
to classes of equivalent functions: two functions are regarded
as equivalent if they coincide on a set of full measure. In
particular, a function $f(x)$ is equivalent to zero if it differs from
zero at most on a set of measure 0. The functions equivalent to zero
evidently form a subspace $L_0$ of the linear space $L$ of all summable
functions; in essence, what we are doing is to pass from space $L$
to its factor-space $L/L_0$ (Chapter II, Section 8, art. 4). In space $L$
we have the quasinorm $\|\varphi\| = I(|\varphi|)$ (we recall that the
quasinorm differs from the norm in that there can exist non-zero
elements with quasinorm zero). The subspace $L_0$ consists of those
and only those functions whose quasinorm is zero. We can thus
introduce a norm into the aggregate of classes of equivalent summable
functions by taking it equal to $I(|\varphi|)$, where $\varphi$ is any
function of a class. In accordance with art. 4, Section 8, Chapter II,
we get a normed linear space of classes of equivalent summable
functions, which we shall term the Lebesgue space. We shall avoid
unnecessary pedantry in future by speaking simply of a "summable
function" rather than of "a class of equivalent summable functions",
whilst we shall write $L$, or more precisely, $L_1(a, b)$, for the
Lebesgue space.

THEOREM (E. Fischer and F. Riesz, 1907). The Lebesgue space
is complete: every sequence of functions $\varphi_1, \varphi_2, \dots$,
fundamental with respect to the norm $\|\varphi\| = I(|\varphi|)$, has a
limit in the sense of this norm in the space $L$.

Before proceeding to the proof of this theorem, we take note of
two simple facts. In the first place, the elements of any fundamental
sequence are bounded in the norm, since from some point
on they are all contained in a sphere of radius $\varepsilon$ with centre
at some point $\varphi_n$. Secondly, to prove the existence of a limit of
a fundamental sequence $\varphi_n$, it is sufficient to show that some
subsequence $\varphi_{n_k}$ $(k = 1, 2, \dots)$ has a limit $\varphi$; the
element $\varphi$ will also be the limit of the whole sequence
$\varphi_n$ in virtue of the inequality

$$\|\varphi - \varphi_n\| \le \|\varphi - \varphi_{n_k}\|
+ \|\varphi_{n_k} - \varphi_n\|,$$

where the second term on the right tends to zero because the
sequence $\varphi_n$ is fundamental.

We now prove the theorem. 

Let $\varphi_1, \varphi_2, \dots, \varphi_n, \dots$ be a fundamental
sequence in the space $L$.

We can always find a sequence of indices $n_1 < n_2 < \dots$ such
that for $n > n_k$

$$\|\varphi_n - \varphi_{n_k}\| < \frac{1}{2^k} \quad (k = 1, 2, \dots).$$

In particular, $\|\varphi_{n_{k+1}} - \varphi_{n_k}\| < 1/2^k$; this means
that

$$I(|\varphi_{n_{k+1}} - \varphi_{n_k}|) < \frac{1}{2^k}.$$

But then, by Beppo Levi's theorem, the series of summable functions
$\sum_{k=1}^{\infty} |\varphi_{n_{k+1}} - \varphi_{n_k}|$ converges almost
everywhere. It follows that the series with partial sums

$$\sum_{k=1}^{N} (\varphi_{n_{k+1}} - \varphi_{n_k})
= \varphi_{n_{N+1}} - \varphi_{n_1}$$

also converges almost everywhere. This means that the function
$\varphi_{n_k}$ has a limit (almost everywhere) as $k \to \infty$. We
denote this limit by $\varphi$. As a limit of measurable functions, the
function $\varphi$ is measurable. Since the norms of the functions
$\varphi_{n_k}$, i.e. the numbers $I(|\varphi_{n_k}|)$, are bounded, it
follows by Fatou's lemma that the function $|\varphi|$ is summable. The
function $\varphi$ is therefore both measurable and summable. Further,
applying the same lemma to the functions
$|\varphi_{n_p} - \varphi_{n_k}|$, which converge almost everywhere to
$|\varphi - \varphi_{n_k}|$ as $p \to \infty$, we have:

$$\|\varphi - \varphi_{n_k}\| = I(|\varphi - \varphi_{n_k}|)
\le \sup_{p > k} I(|\varphi_{n_p} - \varphi_{n_k}|)
= \sup_{p > k} \|\varphi_{n_p} - \varphi_{n_k}\|.$$

But by taking $k$ sufficiently large, the last quantity can be made as
small as we please, since the sequence $\varphi_n$ is fundamental. Hence
$\varphi_{n_k}$ converges in the norm of the space $L$ to $\varphi$ and the
theorem is proved.

In conclusion we shall show that the space $L_1(a, b)$ is the
completion of the space $C_1(a, b)$ of continuous functions $f(x)$ on the
closed interval $[a, b]$ with norm (cf. Chapter II, Section 8)

$$\|f\| = \int_a^b |f(x)|\, dx.$$

The space $L_1(a, b)$ evidently contains the space $C_1(a, b)$ as a
subspace with the same metric. It is therefore sufficient for us to
prove that the subspace $C_1(a, b)$ is dense in $L_1$, so that each
function $\varphi \in L_1$ can be represented as the limit of a sequence
of functions $f_n(x) \in C_1$. We can easily satisfy ourselves that every
step function $h(x)$ possesses this property. Again, since each function
$\varphi \in L_1$ is the difference of two functions in the class $C^+$ it
is enough to verify our assertion for functions in the class $C^+$. Let
$\varphi \in C^+$ and let $h_n \nearrow \varphi$ be a sequence of step
functions. Then

$$\|\varphi - h_n\| = I(\varphi - h_n),$$

and since $\varphi - h_n \searrow 0$, it follows from Beppo Levi's theorem
that

$$I(\varphi - h_n) \to 0,$$

which proves the theorem.
We have seen (Chapter II, Section 3, art. 3) that the set of
polynomials is everywhere dense in space $C(a, b)$, and the set of
trigonometric polynomials
$T(x) = \sum_{k=0}^{n} (a_k \cos kx + b_k \sin kx)$ is
everywhere dense in space $C_1(-\pi, \pi)$. Since $C_1(a, b)$ is everywhere
dense in its completion $L_1(a, b)$, we can conclude that the set of
polynomials is everywhere dense in space $L_1(a, b)$ for any $a$, $b$;
similarly, the set of trigonometric polynomials is everywhere dense
in $L_1(-\pi, \pi)$.


1. If a summable function $f$ vanishes outside a closed interval
$[\alpha, \beta]$ which is interior to the original closed interval
$[a, b]$, so that it can be "displaced", then it is "continuous in the
integral sense" in the space $L$: for any $\varepsilon > 0$, there can be
found $\delta > 0$ such that for $|h| < \delta$:

$$\|f(x + h) - f(x)\| < \varepsilon. \quad (*)$$

Hint. Show that the set of all $f$ for which (*) is satisfied is closed
in the space $L$. Then prove (*) for continuous functions.


4. MEASURE OF SETS AND THEORY OF LEBESGUE INTEGRATION


1. A set $A$ contained in the closed interval $[a, b]$ is said to be
measurable if its "characteristic function", i.e. the function
$\chi_A(x)$, equal to 1 on $A$ and 0 on the complement of $A$, is
measurable (and consequently summable). The integral of the
characteristic function is called the measure of the set $A$ and is
denoted by $\mu A$. If $A$ is a set of measure zero in the sense of the
definition of Section 1, then, as we showed at the time, the integral
of the function $\chi_A(x)$ vanishes, so that the set $A$ has measure zero
in the new sense too: $\mu A = 0$. Conversely, if the integral of the
characteristic function of a set $A$ vanishes, then in agreement with
the observation following Fatou's lemma, $A$ has measure zero in the
former sense. Thus, for a set of measure zero, the new definition
coincides with the earlier one.

The properties of integrals with which we are already familiar
enable us to obtain the properties of measurable sets. Thus the
union $A$ of a finite or countable aggregate of measurable sets
$A_1, A_2, \dots, A_n, \dots$ is a measurable set since its characteristic
function can be defined by the formula

$$\chi_A(x) = \sup \{\chi_{A_1}(x), \dots, \chi_{A_n}(x), \dots\}.$$

In the case that the sets $A_1, A_2, \dots$ are disjoint, we have

$$\chi_A(x) = \chi_{A_1}(x) + \dots + \chi_{A_n}(x) + \dots$$

and by Lebesgue's theorem (Section 3)

$$\mu A = \mu A_1 + \dots + \mu A_n + \dots \quad (1)$$

This property is called the total additivity of measure. Further, if we
subtract a measurable set $A_1$ from a measurable set $A_2$ which
contains it, we get a new measurable set $A_3$, since

$$\chi_{A_3} = \chi_{A_2} - \chi_{A_1}.$$

In this case, $\mu A_3 = \mu A_2 - \mu A_1$. In particular, since the closed
interval $[a, b]$ is measurable, the complement of any measurable
set $A \subset [a, b]$ is measurable. It follows further that the
intersection of a sequence of measurable sets $A_1, A_2, \dots$ is also
measurable, since its complement

$$C \bigcap_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} C A_n$$

is a measurable set. We observe finally the following limiting
relation: if
$A_1 \subset A_2 \subset \dots \subset A_n \subset \dots$ and
$A = \bigcup_{n=1}^{\infty} A_n$, then

$$\mu A = \lim_{n \to \infty} \mu A_n. \quad (2)$$

For we have:

$$A = A_1 + (A_2 - A_1) + (A_3 - A_2) + \dots + (A_{n+1} - A_n) + \dots,$$

$$\mu A = \mu A_1 + \mu(A_2 - A_1) + \dots + \mu(A_{n+1} - A_n) + \dots
= \mu A_1 + (\mu A_2 - \mu A_1) + \dots + (\mu A_{n+1} - \mu A_n) + \dots
= \lim_{n \to \infty} \mu A_n.$$




If $A_1, A_2, \dots$ is an arbitrary system of measurable sets then the
sets $A'_1 = A_1$, $A'_2 = A_1 + A_2$, ..., $A'_n = A_1 + \dots + A_n$, ...
are also measurable and are successively embedded in one another;
by what we have proved, we have for $A = \bigcup_{n=1}^{\infty} A_n$

$$\mu A = \lim \mu A'_n = \lim \mu [A_1 + \dots + A_n].$$

In particular, since it is always the case that

$$\mu(A' + A'') = \mu(A' + [A'' - A'A'']) = \mu(A') + \mu(A'' - A'A'')
\le \mu A' + \mu A'',$$

we have

$$\mu A = \lim_{n \to \infty} \mu A'_n
\le \lim_{n \to \infty} [\mu A_1 + \dots + \mu A_n]
\le \sum_{n=1}^{\infty} \mu A_n. \quad (3)$$

The analogous results for intersections are obtained by taking
complements: for example, if we are given measurable sets
$A_1 \supset A_2 \supset \dots \supset A_n \supset \dots$ and
$A = \bigcap_{n=1}^{\infty} A_n$, then

$$\mu A = \lim_{n \to \infty} \mu \bigcap_{k=1}^{n} A_k
= \lim_{n \to \infty} \mu A_n. \quad (4)$$


Any interval (closed, open, or semi-open) is evidently a measurable 
set and its measure equals its length. The foregoing formulae 
show that any open set, as the (at most) countable sum of disjoint 
open intervals, is measurable and its measure equals the sum of the 
lengths of its component open intervals. Further, any closed set 
is measurable since it is the complement of an open set. In general 
any set which can be obtained from open intervals by an at most 
countable number of successive operations of forming unions, inter- 
sections, or complements is measurable. 
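
For finite systems of intervals the additivity relation (1) and the inequality (3) can be checked directly. The sketch below is only an illustration under that restriction: the function name `measure_of_union` is hypothetical, and it handles finitely many closed intervals, not general measurable sets.

```python
def measure_of_union(intervals):
    """Total length of a finite union of closed intervals (lo, hi)."""
    total, right = 0.0, float("-inf")
    for lo, hi in sorted(intervals):
        if hi > right:
            # Count only the part of [lo, hi] not already covered.
            total += hi - max(lo, right)
            right = hi
    return total

overlapping = [(0.0, 0.5), (0.25, 0.75), (0.9, 1.0)]
disjoint = [(0.0, 0.25), (0.5, 0.75)]

# Inequality (3): the measure of a union never exceeds the sum of measures.
union_measure = measure_of_union(overlapping)
sum_of_lengths = sum(hi - lo for lo, hi in overlapping)

# Relation (1): for disjoint sets the two quantities coincide.
disjoint_measure = measure_of_union(disjoint)
```

Here the overlap of the first two intervals is counted once in `union_measure` and twice in `sum_of_lengths`, which is exactly the slack in inequality (3).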


2. It is natural to ask whether, in general, there exist unmeasurable
sets. The remarks above show that the problem of constructing a concrete 
unmeasurable set is certainly complex, since the usual methods of construc- 
tion, which commence with open intervals and employ countable unions 
and intersections, fail to lead beyond the domain of measurable sets. Up 
till now no individual instance of an unmeasurable set has been constructed. 
Nevertheless we cannot doubt the existence of such sets. We give the 
appropriate reasoning, following N. N. Lusin. It is convenient to imagine 
the closed interval [0, 1], on which we shall carry out our constructions, 
rolled into a circle of unit circumference, so that the points 0, 1 coincide. 
We shall measure all distances along this circumference. Points $\xi$, $\eta$
will be called allied if the distance between them is rational and hostile
if it is irrational. It is evident that if $\xi$ is allied to $\eta$ and
$\eta$ is allied to $\zeta$, then $\xi$ is allied to $\zeta$. We shall call
the totality of points situated at rational distances from a given point its
family; all the members of a family are mutually allied and are hostile to
any point that does not belong to the family. Families $A$, $B$ which contain
prescribed points $\alpha$, $\beta$ respectively either coincide (if
$\alpha$, $\beta$ are allied) or have no element in common (if $\alpha$,
$\beta$ are hostile). The totality of points of the closed interval $[0, 1]$
is the union of some set of distinct families. The set of all families is
certainly uncountable, since the set of all points of the closed interval
$[0, 1]$ is uncountable. We imagine now that a representative of each family
is somehow chosen. We denote by $Z$ the aggregate of all these
representatives and claim that no matter how the representatives are
selected, the set $Z$ will be unmeasurable. (We cannot indicate a possible
rule for selecting the representatives, and it cannot therefore be said, in
effect, that the set $Z$ has been constructed.) Let $r_1, r_2, \dots$ be a
sequence containing all the rational numbers; we denote by $Z_n$ the
displacement of the entire set $Z$ through the distance $r_n$, so that $Z_n$
is composed of points of the form $\xi + r_n$, where $\xi$ runs through the
whole of $Z$.

It clearly follows that the sets $Z_n$, $Z_m$ are disjoint if $m \ne n$;
otherwise we should have an equation $\xi + r_n = \eta + r_m$, where $\xi$,
$\eta$ belong to $Z$; but then the difference $\xi - \eta = r_m - r_n$ would
be rational, i.e. $\xi$, $\eta$ would be allied, which is precluded by the
construction. Further, every point of the closed interval $[0, 1]$ belongs to
some $Z_n$, since every point $\lambda$ has the form $\xi + r_n$, where $\xi$
is the representative of the family containing $\lambda$. Thus the sets
$Z_1, Z_2, \dots$ are mutually disjoint and their union is the whole closed
interval $[0, 1]$. If the set $Z$ were measurable, all the sets
$Z_1, Z_2, \dots$ would also be measurable and their measures would be equal
to that of $Z$. In virtue of the countable additivity of measure, we should
then have to have

$$\mu Z_1 + \mu Z_2 + \dots + \mu Z_n + \dots
= \mu Z + \mu Z + \dots + \mu Z + \dots = 1.$$

But this is not possible for any value of $\mu Z$: neither for $\mu Z > 0$
nor for $\mu Z = 0$. Thus the set $Z$ cannot be measurable.


Problems. 1. Given any measurable function $f$ and number $c$, show that
the set $E = \{x: f(x) \ge c\}$ is measurable.

Hint: $E = \bigcap_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty}
\{x: h_m(x) > c - 1/k\}$, where $h_m \to f$ is a sequence of step functions.

2. If $f_n \to f$ almost everywhere, then, given any $c > 0$,

$$\lim_{n \to \infty} \mu\{x: |f - f_n| \ge c\} = 0. \quad (5)$$


Hint. $\mu \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} \{x: |f - f_m| \ge c\} = 0.$

3. A sequence of measurable functions $f_n(x)$, satisfying (5) for any
$c > 0$, is said to be convergent in measure to the function $f$. Show that,
though the sequence $f_n$ itself may not be convergent almost everywhere to
$f$, it must contain a subsequence convergent almost everywhere to $f$.

Hint. Given any integers $k$ and $m$, there exists an $n = n(k, m)$ for which
$\mu\{x: |f_{n(k,m)} - f| > 1/k\} < \frac{1}{m} \cdot \frac{1}{2^k}$. The
functions $f_{n(k,m)}$ converge to $f$ as $k \to \infty$ on a set of measure
$> b - a - 1/m$. The required sequence is obtained from $f_{n(k,m)}$ by a
diagonal process over $m$.




4. If every subsequence of a sequence of measurable functions $f_n$ contains
a subsequence which converges almost everywhere to a fixed function $f$,
the $f_n$ converge in measure to $f$.

Hint. Use the result of problem 2 and reductio ad absurdum.

5. If $f \ge 0$ is a summable function and $\mu\{x: f(x) \ge c\} \ge b$, then
$If \ge bc$.

Hint. The assertion is obvious for step functions. In the general case,
use the formula of problem 1 to obtain the inequality

$$b \le \mu\{x: f(x) \ge c\} \le \mu\{x: h_m(x) > c - 1/k\} + \varepsilon$$

for any $k$ and $\varepsilon$ and sufficiently large $m$. Hence

$$Ih_m \ge (b - \varepsilon)(c - 1/k).$$

6. If $f_n \ge 0$ and $If_n \to 0$, then $f_n \to 0$ in measure. (Convergence
almost everywhere does not follow from these conditions.) The condition
$f_n \ge 0$ cannot be dispensed with.

Hint. Use problem 5.

7. A metric is defined on the space of all functions $f(x)$ measurable on the
closed interval $a \le x \le b$, in accordance with the formula (6).
Show that the metric space axioms are satisfied, and that convergence in the
metric (6) is the same as convergence in measure. Prove that the metric
space is complete.

8. We denote by $a_k(x)$ the function equal at the point $x \in [0, 1]$ to the
$k$th place of the dyadic expansion of the number $x$. Show that the functions

$$\sigma_n(x) = \frac{1}{n} \sum_{k=1}^{n} a_k(x)$$

satisfy the equation
$\int_0^1 [\sigma_n(x) - 1/2]^2\, dx = \frac{1}{4n}$.

Hint. $\int_0^1 a_j(x)\, a_k(x)\, dx = 1/4$ for any $j \ne k$.

9. (continued). The sequence $\sigma_{n^2}(x)$ is convergent almost everywhere
to $f(x) = 1/2$.

Hint. $\mu\{x: |\sigma_n(x) - 1/2| > \varepsilon\} \le \frac{1}{4n\varepsilon^2}$
for any $\varepsilon > 0$.

10. (continued). The sequence $\sigma_n(x)$ is convergent almost everywhere to
$f(x) = 1/2$.

Hint. If $m^2 < n < (m + 1)^2$, then
$\frac{1}{n} \sum_{k=m^2+1}^{n} a_k(x) \le \frac{2m + 1}{m^2}$.

Note. The result of problem 10 shows that, for almost any real number
of the closed interval $[0, 1]$, the share of zeros and unities in its dyadic
resolution, defined naturally as the limit of the function $\sigma_n(x)$, is
equal to 1/2.
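
The identity of problem 8 can be confirmed exactly for small $n$. The following sketch is an illustration only (the function name `mean_square_deviation` is hypothetical): it uses the fact that $\sigma_n$ is constant on each dyadic interval of length $2^{-n}$, so the integral reduces to a finite sum over all strings of $n$ binary digits.

```python
from fractions import Fraction
from itertools import product

def mean_square_deviation(n):
    """Exact value of the integral of (sigma_n(x) - 1/2)^2 over [0, 1].

    sigma_n is constant on each dyadic interval of length 2**-n, on which
    the first n binary digits of x are fixed, so the integral is a finite
    sum over all digit strings, each weighted by 2**-n.
    """
    half = Fraction(1, 2)
    total = Fraction(0)
    for digits in product((0, 1), repeat=n):
        sigma = Fraction(sum(digits), n)
        total += (sigma - half) ** 2
    return total / 2**n

results = {n: mean_square_deviation(n) for n in range(1, 11)}
```

Exact rational arithmetic is used so that the comparison with $1/4n$ is an equality rather than a floating-point approximation.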

3. We shall now examine the structure of a measurable set of
positive measure. We shall show that every measurable set $A$ of
positive measure can be regarded, to within a set of arbitrarily small
measure, as the union of a finite number of closed intervals. More
precisely, for any given $\varepsilon > 0$, we undertake to find a finite
system $B$ of closed intervals such that the set $A$ can be obtained from
$B$ by adding a set $a$ of measure $< \varepsilon$ and removing a set $b$
of measure also $< \varepsilon$.

For the proof we represent the measurable function $\chi_A(x)$ as the
limit of a sequence of step functions $h_n(x)$ which converges almost
everywhere. If we replace each function $h_n$ by a function $h'_n$ equal
to 0 for $h_n < 1/2$ and 1 for $h_n \ge 1/2$, we obtain a new sequence of
step functions, which also converges almost everywhere to $\chi_A$ and
which assumes only the values 0 or 1. Each of the functions $h'_n$ is the
characteristic function of some set $B_n$ that represents a finite
system of closed intervals. We claim that for sufficiently large $n$
the set $B_n$ satisfies the condition stipulated. For consider the
functions $(h'_n - \chi_A)^+$, $(h'_n - \chi_A)^-$. The function
$(h'_n - \chi_A)^+$ is the characteristic function of the set $b_n$
composed of points which belong to the set $B_n$ but not to the set $A$;
the function $(h'_n - \chi_A)^-$ is the characteristic function of the
set $a_n$ of points which belong to $A$ but not to $B_n$. The set $A$ is
obtained from the set $B_n$ by removing $b_n$ and adding $a_n$.

But

$$\mu(a_n) + \mu(b_n) = I(h'_n - \chi_A)^- + I(h'_n - \chi_A)^+
= I(|h'_n - \chi_A|) \to 0,$$

which completes the proof.

4. The measure of a measurable set as its upper measure. The
theorem we have proved, in conjunction with the foregoing
observations, enables us to formulate a new, more constructive definition
of a measurable set. We cover an arbitrary set $A$ on the closed
interval $[a, b]$ with a finite or countable system of intervals and
find the sum of their lengths. The exact lower bound of the sums
obtained under all possible such coverings is denoted by $\mu^* A$ and
is said to be the upper or exterior measure of the set $A$. Thus a set
of measure zero has exterior measure zero by its very definition.
We shall show that the exterior measure of any measurable set $A$
coincides with the quantity $\mu A$.

If $B$ is a system of non-overlapping intervals covering a set $A$,
then, as above, $B$ is a measurable set with measure equal to the
sum of the lengths of the intervals; since $A \subset B$, we have
$\mu A \le \mu B$ and hence the exterior measure of the set $A$

$$\mu^* A = \inf_{B \supset A} \mu B \ge \mu A. \quad (5)$$
BDA 


On the other hand, using the theorem proved above, we can
construct a finite system $B_n$ of intervals for any given
$\varepsilon > 0$ and any $n$, such that

$$A + b_n = B_n + a_n, \quad (6)$$

where each of the measurable sets $a_n$, $b_n$ has measure
$< \varepsilon/2^{n+1}$.

Let $B = \bigcup_{n=1}^{\infty} B_n$, $a = \bigcap_{n=1}^{\infty} a_n$,
$b = \bigcup_{n=1}^{\infty} b_n$; obviously,

$$\mu a \le \lim \mu a_n = 0, \quad
\mu b \le \sum_{n=1}^{\infty} \mu b_n < \tfrac{1}{2}\varepsilon.$$

We have the inclusions

$$A \subset B + a, \quad (7)$$
$$B \subset A + b. \quad (8)$$

For, if $x \in A$ does not belong to $B$, it does not belong to any
$B_n$, and since $A \subset B_n + a_n$, it belongs to all the $a_n$, and
therefore to $a$; this proves (7).

Further, we take a point $y$ belonging to $B$ and hence to some $B_n$.
If $y \notin A$, it follows from (6) that $y$ belongs to $b_n$ and
therefore to $b$; this proves (8). Notice that (8) implies
$\mu B \le \mu A + \mu b < \mu A + \tfrac{1}{2}\varepsilon$.

We return to inclusion (7). In the long run the set $B$ is a system
of intervals, finite or countable. The set $a$, as a set of measure zero,
can be covered by a system of intervals of overall length
$< \tfrac{1}{2}\varepsilon$. In short, the whole set $A$ can be covered by
a (finite or) countable system of intervals of overall length not
exceeding $\mu B + \tfrac{1}{2}\varepsilon \le \mu A + \varepsilon$. Hence
$\mu^* A \le \mu A + \varepsilon$. Since $\varepsilon$ is arbitrary,
$\mu^* A \le \mu A$, and combining this with inequality (5), we get that
for any measurable set

$$\mu^* A = \mu A.$$


In other words, the measure of any measurable set can be defined 
constructively as its exterior measure. 

5. The question naturally arises whether measurable sets themselves
can be defined in terms of exterior measure without involving
the concept of the integral. The answer to this question is as follows:
a set $A$ on the closed interval $[a, b]$ is measurable if and only if the
sum of the exterior measures of $A$ and its complement $CA$ (in $[a, b]$)
is equal to the length of $[a, b]$:

$$\mu^* A + \mu^* CA = b - a. \quad (9)$$

The necessity of this condition is immediately apparent; if $A$ is
measurable, then so is its complement and the equation (9) follows
from $\mu A + \mu CA = b - a$. We prove the sufficiency of the
condition. Let (9) hold for a set $A$. Then for any $\varepsilon > 0$ we can
find a covering of the sets $A$, $CA$ by systems of non-overlapping
intervals $U$, $V$ such that the overall sum of their lengths

$$\mu U + \mu V < b - a + \varepsilon.$$

We denote the characteristic functions of the sets $U$, $V$ by $h_U$,
$h_V$. The function $1 - h_V$ is non-zero only within $A$, and therefore

$$0 \le 1 - h_V \le \chi_A \le h_U,$$

where $\chi_A$ is the characteristic function of $A$.

It follows that

$$0 \le I(1 - h_V) \le Ih_U.$$

At the same time, by hypothesis

$$Ih_U - I(1 - h_V) = Ih_U + Ih_V - (b - a)
= \mu U + \mu V - (b - a) < \varepsilon.$$

As $\varepsilon \to 0$ the sequence $h_U$ $(= h_{U(\varepsilon)})$ can be
taken to be monotone decreasing and the sequence $1 - h_V$ monotone
increasing. Hence the sequence $h_U - (1 - h_V)$ is also monotone
decreasing. Its limit $f$ is a non-negative bounded measurable function,
the integral of which vanishes, by the corollary to Beppo Levi's
theorem; hence $f$ vanishes almost everywhere. But since

$$h_U - (1 - h_V) \ge h_U - \chi_A \ge 0,$$

we have $\chi_A = \lim_{\varepsilon \to 0} h_U$ almost everywhere, and it is
therefore a measurable and summable function. This proves the
sufficiency of the condition formulated.

Note. The theorem proved can be formulated in the following
somewhat more general way: a set $A$ contained in the closed interval
$[a, b]$ is measurable if and only if there exist sequences of measurable
sets $U_n$, $V_n$ such that $U_n \supset A$, $V_n \supset CA$,

$$b - a \le \mu U_n + \mu V_n \le b - a + \varepsilon_n, \quad
\varepsilon_n \to 0.$$

6. In his development of measure theory and the theory of the
integral, Lebesgue started from the definition of a measurable set
in just the form of the last theorem. That is, he called a set $A$
measurable if it satisfied the relation

$$\mu^* A + \mu^* CA = b - a. \quad (10)$$

This definition is sometimes expressed in another form: the interior
(lower) measure of a set $A$ is defined by the formula

$$\mu_* A = \sup_{F \subset A} \mu F,$$

where the upper bound is taken over all closed sets $F$ contained in
the given set $A$; the measure of a closed set $F$ is taken as the
number $b - a - \mu CF$, where $CF$ is the open set complementary
to $F$. Further, $A$ is said to be a measurable set if

$$\mu_* A = \mu^* A. \quad (11)$$

It is easy to verify that the definitions (10), (11) are equivalent. For

$$\mu_* A = \sup_{F \subset A} \mu F
= \sup_{F \subset A} \{b - a - \mu CF\}
= b - a - \inf_{F \subset A} \mu CF.$$

But if a set $F$ is included in a set $A$, its complement $CF$ covers the
set $CA$; it follows immediately that

$$\mu_* A = b - a - \mu^* CA,$$

which implies that the equations

$$\mu_* A = \mu^* A, \quad \mu^* A + \mu^* CA = b - a$$

are equivalent.
Having established the fundamental properties of measure, of
which we were speaking at the beginning of the paragraph, on the
basis of the definition (10) [or (11)], Lebesgue proceeds to the
definition of measurable functions. He calls a function $\varphi(x)$
measurable (and we call it Lebesgue-measurable) if any set of the form

$$\{x: \varphi(x) \le c\}$$

is measurable, whatever the value of the real number $c$.

For the moment we shall call functions which are measurable
in the sense of Section 1 measurable in the sense of Riesz. We shall
show that the definitions of functions measurable in the senses
of Lebesgue and Riesz are equivalent.

Let $\varphi(x)$ be measurable in the sense of Riesz; we introduce the
function

$$\varphi_c(x) = \max \{\varphi(x), c\}.$$

The ratio

$$\frac{\varphi_{c+\varepsilon} - \varphi_c}{\varepsilon}$$

is also a measurable function in the sense of Riesz, vanishing for
$\varphi(x) \ge c + \varepsilon$ and equal to 1 for $\varphi \le c$. Its limit
as $\varepsilon \to 0$ is the characteristic function of the set
$\{x: \varphi(x) \le c\}$, which is therefore also measurable in the sense
of Riesz. This means that the set itself is measurable in the sense of
Riesz and consequently also in the sense of Lebesgue.




Conversely, if a function $\varphi(x)$ is Lebesgue-measurable, i.e. if all
sets of the form $\{x: \varphi(x) \le c\}$ are measurable, then sets of the
form

$$\{x: c_1 < \varphi(x) \le c_2\}
= \{x: \varphi(x) \le c_2\} - \{x: \varphi(x) \le c_1\}$$

are also measurable for any $c_1$, $c_2$.

Let $\chi_{c_1, c_2}$ be the characteristic function of this set. For a
given $n$, we put a function $\varphi_n$ equal to $m/n$ at those points
where the inequality

$$\frac{m}{n} < \varphi(x) \le \frac{m + 1}{n}$$

is satisfied. The function $\varphi_n$ is the sum of an everywhere
convergent series of functions measurable in the sense of Riesz

$$\varphi_n(x) = \sum_{m=-\infty}^{\infty} \frac{m}{n}\,
\chi_{\frac{m}{n}, \frac{m+1}{n}}(x)$$

and is therefore itself measurable in the sense of Riesz. It is evident
that the inequality

$$|\varphi_n(x) - \varphi(x)| \le \frac{1}{n}$$

holds. Thus $\varphi(x)$ is the limit as $n \to \infty$ of the sequence of
Riesz-measurable functions $\varphi_n(x)$ and is therefore measurable in
the sense of Riesz, as required.
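
The approximating functions $\varphi_n$ act on the values of $\varphi$ only, so the bound $|\varphi_n - \varphi| \le 1/n$ can be checked value by value. The sketch below is an illustration under that reading; the function name `phi_n` and the sample values are hypothetical, and exact rational arithmetic is used to avoid rounding at the endpoints $m/n$.

```python
import math
from fractions import Fraction

def phi_n(value, n):
    """Return m/n, where m is the integer with m/n < value <= (m + 1)/n."""
    m = math.ceil(value * n) - 1
    return Fraction(m, n)

# Sample values of phi, including points that fall exactly on the grid m/n.
samples = [Fraction(k, 7) for k in range(-14, 15)]

# Largest deviation |phi_n - phi| over the samples, for several n.
errors = {n: max(abs(phi_n(v, n) - v) for v in samples) for n in (1, 10, 100)}
```

A value lying exactly on the grid, such as $\varphi = 1/2$ with $n = 2$, is sent to the left endpoint 0 of its half-open interval, which is where the bound $1/n$ is attained.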

Lebesgue goes on to construct the definition of an integral for
a bounded measurable function $\varphi(x)$. To do this, he partitions
the domain of variation [m, M] of the function $\varphi(x)$ into parts
by the points $m = y_0 < y_1 < \cdots < y_n = M$ and forms the
"integral sum"

    $$s_\Pi(\varphi) = \sum_{j=0}^{n-1} y_j\, \mu\{x : y_j < \varphi(x) \le y_{j+1}\}, \qquad (8)$$

where the numbers $\mu\{x : y_j < \varphi(x) \le y_{j+1}\}$ are
meaningful since $\varphi(x)$ is measurable. When the partition of the
closed interval [m, M] is refined indefinitely, the sums
$s_\Pi(\varphi)$ tend, as Lebesgue proves, to a (uniquely determined)
limit which is called the Lebesgue integral of the function
$\varphi(x)$.
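The integral sums (8) can be tried out numerically. The following is a minimal sketch under assumptions of our own: the function $\varphi(x) = x^2$ on [0, 1] and the uniform partition $y_j = j/n$ are chosen only because the measure of each level set is then known in closed form; the name `lebesgue_sum` is hypothetical.

```python
import math


def lebesgue_sum(n):
    """Integral sum (8) for the assumed example phi(x) = x**2 on [0, 1].

    The range [m, M] = [0, 1] is split by the points y_j = j / n.  For this
    phi the set {x : y_j < phi(x) <= y_{j+1}} is the interval
    (sqrt(y_j), sqrt(y_{j+1})], so its measure is sqrt(y_{j+1}) - sqrt(y_j).
    """
    total = 0.0
    for j in range(n):
        y_lo, y_hi = j / n, (j + 1) / n
        measure = math.sqrt(y_hi) - math.sqrt(y_lo)
        total += y_lo * measure
    return total


for n in (10, 100, 1000):
    # The sums increase with refinement and approach the integral 1/3.
    print(n, lebesgue_sum(n))
```

Since the sum uses the lower value $y_j$ on each level set, it approaches the integral from below, and the error cannot exceed the mesh 1/n of the partition of the range.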

We verify that the Lebesgue definition agrees with the Riesz
definition assumed in our account. The function $\varphi_\Pi(x)$ equal
to $y_j$ on the set $\{x : y_j < \varphi \le y_{j+1}\}$ is measurable
and its integral in the sense of Riesz is precisely equal to the
quantity $s_\Pi(\varphi)$ of (8). Since it differs from the function
$\varphi(x)$ by at most $\max(y_{j+1} - y_j)$, the function
$\varphi_\Pi(x)$ tends uniformly to the function $\varphi(x)$ under an
unlimited refinement of the partition of the interval [m, M]. It
follows that $I\varphi = \lim I\varphi_\Pi$, which proves the existence
of the Lebesgue integral and its equivalence to the Riesz integral. If
the function $\varphi(x)$ is unbounded but non-negative, Lebesgue
proceeds as follows. Truncating the function $\varphi$ at the value N,
i.e. considering the function

    $$\varphi_N(x) = \min(\varphi(x),\, N),$$

he obtains a non-negative, bounded, measurable function. Its integral
$I\varphi_N$, which exists in virtue of the foregoing argument, is
non-negative and increases (more precisely, does not decrease) as N
increases. If the numbers $I\varphi_N$ have a finite limit as
$N \to \infty$, the function $\varphi$ is said to be summable in the
sense of Lebesgue and its Lebesgue integral is put equal to
$\lim I\varphi_N$. In the general case, where $\varphi$ can take any
sign, it is said to be summable in the sense of Lebesgue if its
positive and negative parts $\varphi^+$, $\varphi^-$ are summable in
the sense of Lebesgue; its integral is then constructed according to
the formula

    $$I\varphi = I\varphi^+ - I\varphi^-.$$
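The truncation procedure can be followed on a concrete unbounded function. The sketch below assumes the example $\varphi(x) = x^{-1/2}$ on (0, 1], which is not taken from the text; for it a direct computation gives $I\varphi_N = 2 - 1/N$ for $N \ge 1$, so the truncated integrals increase to the finite limit 2. The name `truncated_integral` and the midpoint rule are illustrative choices only.

```python
def truncated_integral(N, cells=100_000):
    """Midpoint-rule estimate of the integral of min(x**-0.5, N) over (0, 1].

    phi(x) = x**-0.5 is an assumed example of an unbounded non-negative
    function; min(phi, N) is its truncation phi_N at the value N.
    """
    h = 1.0 / cells
    return h * sum(min(((i + 0.5) * h) ** -0.5, N) for i in range(cells))


for N in (1, 10, 100):
    # Compare the numerical value with the closed form 2 - 1/N.
    print(N, truncated_integral(N), 2.0 - 1.0 / N)
```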


We verify now that the class of summable functions arising from
our definition (summable in the sense of Riesz) coincides with the
class of functions summable in the sense of Lebesgue, and that
the values of corresponding integrals coincide. It is sufficient to
establish this for non-negative functions. If a non-negative function
$\varphi(x)$ is summable in the sense of Lebesgue, it is the limit of
an increasing sequence of bounded summable functions $\varphi_N$; it
follows by Beppo Levi's theorem that $\varphi$ is summable in the sense
of Riesz and we have $I\varphi = \lim I\varphi_N$, i.e. its integral in
the sense of Riesz coincides with its Lebesgue integral. Conversely, if
$\varphi(x) \ge 0$ is summable in the sense of Riesz, then all the
functions $\varphi_N$ are bounded and measurable, and their integrals
do not exceed $I\varphi$. Since the $I\varphi_N$ are bounded, they have
a finite limit and hence $\varphi(x)$ is summable in the sense of
Lebesgue; by the foregoing argument, its integral coincides with
$I\varphi$. Thus the results of the Lebesgue and Riesz constructions
coincide completely. We have preferred to expound the construction
originally given by Riesz, because as regards the technique of proofs
it is somewhat simpler on the whole than the construction of the
Lebesgue method and permits a direct construction of the theory of the
integral, avoiding the rather unwieldy Lebesgue theory of measure.

7. Integration over a measurable set. Up till now our domain
of integration has been the closed interval [a, b]. But an integral
can easily be defined over any measurable set $E \subset [a, b]$. Let
$\chi_E(x)$ be the characteristic function of the set E; we shall call
a function $\varphi$ summable on the set E if the product
$\chi_E \varphi$ is summable on [a, b], and we define

    $$\int_E \varphi\,dx = \int_a^b \chi_E\, \varphi\,dx = I(\chi_E \varphi).$$

The integral over E evidently possesses all the usual properties
of the integral. We note some of these specifically.

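(a) If $\varphi$ is summable on $E = E_1 + E_2 + \cdots$ and the
(measurable) sets $E_1, E_2, \ldots$ are mutually disjoint, then
$\varphi$ is summable on each $E_n$ and

    $$\int_E \varphi\,dx = \int_{E_1} \varphi\,dx + \int_{E_2} \varphi\,dx + \cdots \qquad (12)$$

For since $\chi_{E_n} \chi_E = \chi_{E_n}$, the function
$\chi_{E_n} \varphi = \chi_{E_n} (\chi_E \varphi)$ is summable, i.e.
$\varphi$ is summable on $E_n$. Further,
$\chi_{E_1} + \chi_{E_2} + \cdots = \chi_E$, hence

    $$\chi_E \varphi = \chi_{E_1} \varphi + \chi_{E_2} \varphi + \cdots,$$

and it follows that the partial sums of this series are bounded in
absolute value by the summable function $\chi_E |\varphi|$. Integrating
the series term by term, we get (12).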

(b) The converse result, that a function $\varphi$ is summable on E if
it is summable on each $E_n$ and the series (12) converges, can only
be substantiated under the additional hypothesis that $\varphi \ge 0$
on each $E_n$.

In this case $\chi_E \varphi$ is the limit of the non-decreasing
sequence of partial sums of the series
$\chi_{E_1} \varphi + \chi_{E_2} \varphi + \cdots$, the integrals of
which have a common constant bound; applying Beppo Levi's theorem, we
get that $\chi_E \varphi$ is summable and equation (12) holds.

Property (b) is sometimes applied in the following equivalent
form: if a function $\varphi$ is non-negative and is summable on each
of the sets $E_1 \subset E_2 \subset \cdots$ and the integrals
$\int_{E_n} \varphi\,dx$ are bounded by a fixed constant, then
$\varphi$ is summable on $E = \bigcup_{n=1}^{\infty} E_n$ and

    $$\int_E \varphi\,dx = \lim_{n \to \infty} \int_{E_n} \varphi\,dx.$$


(c) Absolute continuity of an integral over a set. The integral of a
function $\varphi$ summable on [a, b], taken over a measurable set E,
tends to zero together with the measure of this set, independently of
its disposition on [a, b]. In other words, for any $\varepsilon > 0$
there can be found $\delta > 0$ such that $\mu E < \delta$ implies
$\int_E |\varphi|\,dx < \varepsilon$. For the proof we find a bounded
measurable function $h(x) \ge 0$, dependent on the prescribed
$\varepsilon > 0$, for which

    $$\int_a^b \bigl|\, |\varphi(x)| - h(x) \bigr|\,dx < \frac{\varepsilon}{2}.$$

Let h(x) be bounded by a number M, so that $0 \le h(x) \le M$. Then for
any measurable set E of measure less than
$\delta = \dfrac{\varepsilon}{2M}$,

    $$\int_E |\varphi|\,dx \le \int_E \bigl|\, |\varphi| - h \bigr|\,dx + \int_E h\,dx < \frac{\varepsilon}{2} + \delta M = \varepsilon,$$

as required.

8. We dwell finally on one further important theorem in the 
Lebesgue theory of measure: 

THEOREM (D. F. Egoroff, 1911). Let there be given a sequence
$\varphi_1(x), \varphi_2(x), \ldots, \varphi_n(x), \ldots$ of measurable
functions which converges almost everywhere on [a, b] to some limit
$\varphi(x)$. For any $\varepsilon > 0$ there exists a measurable (even
closed) set A of measure $> b - a - \varepsilon$ on which the sequence
converges uniformly to its limit.

Proof. The limit function $\varphi(x)$ is measurable. By considering
the functions $\varphi - \varphi_n$ in place of the $\varphi_n$, the
problem can be reduced to the case of a sequence that converges to
zero; we shall therefore assume at the outset that
$\varphi(x) \equiv 0$. Further, the functions $\varphi_n$ can be
supposed to be non-negative and to tend monotonely to zero; if this is
not the case, we can substitute
$\sup\{|\varphi_n|, |\varphi_{n+1}|, \ldots\}$ for $\varphi_n$.

We denote by $A_n^m$ the set of points on which
$0 \le \varphi_n(x) \le 1/m$. For fixed m the set $A_n^m$ becomes
larger as n increases and the system of all these sets covers the
whole closed interval [a, b] (apart, at most, from the set of measure
zero on which the sequence does not converge to zero).
Hence $\lim_{n \to \infty} \mu A_n^m = b - a$ and we can find a number
n = n(m) such that

    $$\mu A_{n(m)}^m > b - a - \frac{\varepsilon}{2^m}.$$

We now consider the intersection A of all the sets $A_{n(m)}^m$
(m = 1, 2, ...). Since the complement of A has measure

    $$\mu(CA) = \mu\Bigl\{ C \prod_{m} A_{n(m)}^m \Bigr\} = \mu \bigcup_{m} C A_{n(m)}^m \le \sum_{m=1}^{\infty} \frac{\varepsilon}{2^m} = \varepsilon,$$

the measure of A itself is at least $b - a - \varepsilon$. We claim
that on the set A the convergence $\varphi_n \to 0$ is uniform, i.e.
for a given 1/m there exists a number N such that, for $n \ge N$, the
inequality $\varphi_n(x) \le 1/m$ holds everywhere on A. For as N we
can take n(m), since $A \subset A_{n(m)}^m$, and hence if
$n \ge n(m)$, then $\varphi_n(x) \le \varphi_{n(m)}(x) \le 1/m$ at all
points of A. Remembering that $\mu A = \sup \mu F$ over closed sets
$F \subset A$, we can replace the measurable set A by a closed set of
arbitrarily close measure. The theorem is proved.
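A simple numerical illustration of the theorem, under assumptions of our own: the sequence $\varphi_n(x) = x^n$ on [0, 1] (not an example from the text) converges to zero at every point except x = 1, but not uniformly on [0, 1]; on the closed set $A = [0, 1 - \varepsilon]$, of measure $1 - \varepsilon$, the convergence is uniform. The helper `sup_on_grid` is hypothetical and only samples the supremum on a finite grid.

```python
def sup_on_grid(n, right_end, points=10_001):
    """Largest value of x**n over an equally spaced grid on [0, right_end]."""
    return max((right_end * i / (points - 1)) ** n for i in range(points))


eps = 0.01
for n in (10, 100, 1000):
    # Second column: supremum over [0, 1], which stays equal to 1.
    # Third column: supremum over A = [0, 1 - eps], which tends to 0.
    print(n, sup_on_grid(n, 1.0), sup_on_grid(n, 1.0 - eps))
```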

It is not difficult to see that in the formulation of the theorem 
the closed interval [a,b] can be replaced by any measurable set. 
A consequence of Egoroff’s theorem is a theorem due to Lusin 
which establishes a new feature of measurable functions: 

THEOREM (N. N. Lusin, 1915†). A function $\varphi(x)$ defined on the
closed interval [a, b] is measurable if and only if for any
$\varepsilon > 0$ there exists a closed subset $A \subset [a, b]$ of
measure $> b - a - \varepsilon$ on which $\varphi(x)$ is continuous.

For if $h_n(x) \to \varphi(x)$ is a sequence of step functions that
defines the measurable function $\varphi(x)$, we can cover the set of
measure zero comprising the points of divergence of the sequence and
the points of discontinuity of the functions $h_n(x)$ with a system of
open intervals of overall length $< \varepsilon/2$. Further, by
Egoroff's theorem, we can remove from the set that remains a subset,
also of measure $< \varepsilon/2$, such that the sequence $h_n(x)$ will
converge uniformly on the remainder; we can assume also that the subset
removed represents a system of open intervals. On the remaining
closed subset F of [a, b], the measure of which exceeds
$b - a - \varepsilon$, the functions $h_n(x)$ are continuous and
converge to their limit $\varphi(x)$ uniformly; it follows that
$\varphi(x)$ is continuous on F.

† D. F. Egoroff (1869-1931), N. N. Lusin (1883-1950) are Muscovite
mathematicians, the founders of the Moscow school of the theory of functions
of a real variable.

We leave a proof of the converse assertion to the reader. 


Problems. 1. Show that to within a set of measure zero every measurable 
set is the intersection of a sequence of open sets and the union of a sequence 
of closed sets. 

2. Show that an aggregate of measurable sets, every pair of which differ 
by more than a set of measure zero, has at most the power of the continuum. 

Hint. Use problem 1 and problem 5 of Chapter II, Section 2. 

3. Construct a measurable set which together with its complement, has 
positive measure on every open interval. 

4. Let A be a set of measure $\ge \varepsilon$ on the closed interval
[0, 1] and let $\xi_1, \ldots, \xi_n$ be arbitrary numbers in the
interval with $n > 2/\varepsilon$. Show that A contains a pair of
points the distance between which coincides with the distance between
some pair of the points $\xi_1, \xi_2, \ldots, \xi_n$.

Hint. Show that the sets $\xi_1 + A, \ldots, \xi_n + A$ (which lie in
the closed interval [0, 2]) cannot be mutually disjoint.

5. Show that every set of positive measure contains a pair of points a
rational distance apart.

Hint. Use the preceding problem.

6. A set A on the closed interval [a, b] has measure
$> \frac{1}{2}(b - a)$. Show that A contains a subset of positive
measure, symmetrical about the mid-point of the interval.

Hint. Consider the intersection of the set A with its reflection through
the mid-point of the interval.

7. Measurable functions f, g are said to be equi-measurable if for any c
we have $\mu\{x : f > c\} = \mu\{x : g > c\}$. Show that: (a) any
measurable function is equi-measurable with some non-decreasing
function; (b) equi-measurable non-decreasing functions coincide almost
everywhere.

8. Given an arbitrary sequence $f_1(x), \ldots, f_n(x), \ldots$ of
measurable functions on the closed interval [a, b], show that the set E
of points where $\lim_{n \to \infty} f_n(x)$ exists is measurable.


5. GENERALISATIONS 
1. The Case of an Infinite Interval 


All the preceding investigations have related to functions defined
on a closed interval [a, b] of the x-axis. But the infinite intervals
$[a, \infty)$, $(-\infty, b]$, $(-\infty, \infty)$ present no
difficulty. In all these cases we shall call a function h(x) a step
function if it assumes constant values on a finite number of finite
open intervals $\Delta_j = (x_j, x_{j+1})$; on the remainder of the
infinite interval it is assumed to vanish. A function $\varphi(x)$ is
said to be measurable if it is the limit of a sequence of step
functions which converges almost everywhere on the whole infinite
interval. The integral of a step function h(x) that assumes the value
$b_j$ on an interval $\Delta_j$ of length $|\Delta_j|$
(j = 1, 2, ..., k) is defined naturally by the formula

    $$Ih = \sum_{j=1}^{k} b_j\, |\Delta_j|.$$

The class $C^+$ is defined as before as the aggregate of functions
f(x), each of which is the limit of an increasing sequence of step
functions $h_n(x)$ with bounded integrals. The class L of summable
functions is defined as the class of differences $\varphi = f - g$,
$f \in C^+$, $g \in C^+$. All the theorems of Sections 2-4, with a
single exception, are preserved without change: in the exceptional
case, bounded measurable functions are no longer summable in general.
In particular, the equation $I\varphi = \lim I\varphi_n$ may cease to
hold if, in converging almost everywhere to a function $\varphi$, the
functions $\varphi_n$ merely retain a common bound; the original
condition $|\varphi_n| \le \varphi_0$, where the function $\varphi_0$
is summable, remains sufficient for it to hold. In
proving the theorem that the limit of a sequence of measurable 
functions is a measurable function (Section 3, art. 4) it will be neces- 


1 
sary in this connexion to replace the function y, = er ie by 


on = aa where h is a strictly positive summable function. 
é 

The remaining results carry over without change to the case of an
infinite interval. In particular, it is possible to define on an
infinite interval $\Delta$ a normed linear space $L_1(\Delta)$ (the
Lebesgue space) of all summable functions with the norm

    $$\|\varphi\| = \int_\Delta |\varphi(x)|\,dx.$$

It is proved in just the same way as for a finite interval that the
space $L_1(\Delta)$ is complete and that the step functions constitute
a set everywhere dense in it.
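The exceptional case mentioned above can be made concrete. The following worked example is an illustration of our own choosing, not taken from the text: on the interval $[0, \infty)$ the function identically equal to 1 is bounded and measurable but not summable, and the step functions $\varphi_n$ below have the common bound 1 and converge to zero everywhere, although their integrals do not tend to zero.

```latex
% Assumed illustrative example on the infinite interval [0, \infty).
\varphi_n(x) = \chi_{(n,\,n+1)}(x), \qquad |\varphi_n(x)| \le 1, \qquad
\lim_{n\to\infty} \varphi_n(x) = 0 \quad \text{for every } x \ge 0, \\[4pt]
I\varphi_n = 1 \cdot |(n,\,n+1)| = 1 \quad (n = 1, 2, \ldots),
\qquad \text{whereas} \qquad
I\Bigl(\lim_{n\to\infty} \varphi_n\Bigr) = 0 .
```

No summable function $\varphi_0$ can satisfy $\varphi_n \le \varphi_0$ for all n here, since such a function would be at least 1 on every interval (n, n+1) and its integral would be infinite.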




2. The Case of Several Independent Variables 


We restrict ourselves to the case of two variables x, y and
functions $\varphi(x, y)$ defined on the rectangle
$D = \{a_1 \le x \le b_1,\ a_2 \le y \le b_2\}$.

A set on the rectangle D is said to be of measure zero if for
any $\varepsilon > 0$ it can be covered by a finite or countable system
of rectangles
$D_j = \{a_1^{(j)} \le x \le b_1^{(j)},\ a_2^{(j)} \le y \le b_2^{(j)}\}$,
the sum of the areas of which is at most $\varepsilon$. Thus any closed
interval or linear set is a set of measure zero.

If D is subdivided into a finite number of rectangles
$D_1, \ldots, D_n$, a function which is constant on each of the $D_j$
is said to be a step function. The integral of a step function h(x, y)
which assumes the value $b_j$ on the rectangle $D_j$ is defined by the
formula

    $$Ih = \sum_{j=1}^{n} b_j\, |D_j|,$$

where $|D_j|$ is the area of $D_j$.

As before, the class $C^+$ is the totality of functions, each of which
is the limit of an increasing sequence of step functions $h_n$ with
bounded integrals $I h_n$. The class L (the Lebesgue space) is the
class of differences $\varphi = f - g$, $f \in C^+$, $g \in C^+$. The
theorems of Sections 2-4 carry over without change to the case of two
variables, discounting notation [we write f(x, y) instead of f(x) and
use the word "rectangle" instead of "interval" etc.].

But an important new problem arises here, namely that of
repeated integration.

In classical analysis a double integral (initially defined as a limit
of Riemann integral sums), say of a continuous function f(x, y),
reduces to two ordinary integrals with respect to one variable:

    $$\int_{a_1}^{b_1} \int_{a_2}^{b_2} f(x, y)\,dx\,dy = \int_{a_1}^{b_1} \Bigl\{ \int_{a_2}^{b_2} f(x, y)\,dy \Bigr\}\,dx.$$

In the Lebesgue theory of the integral a similar formula obtains.
The corresponding theorem is formulated as follows:

THEOREM (G. Fubini, 1907). Let $\varphi(x, y)$ be a summable function
on the rectangle $D = \{a_1 \le x \le b_1,\ a_2 \le y \le b_2\}$. Then:
(1) regarded as a function of the argument x with y fixed, it is
summable for almost all values of y; (2) its integral over the closed
interval $a_1 \le x \le b_1$, which we denote by

    $$I_x\{\varphi(x, y)\},$$

is summable, as a function of y, on the closed interval
$a_2 \le y \le b_2$; (3) we have

    $$I_y\{I_x\, \varphi(x, y)\} = I\varphi.$$

Interchanging x and y, we get the parallel result

    $$I_x\{I_y\, \varphi(x, y)\} = I\varphi.$$


Proof. It is sufficient to prove the theorem for a function
$\varphi \in C^+$. Let $h_n(x, y)$ be a monotone increasing sequence of
step functions which converges to the function $\varphi(x, y)$ on a set
of full plane measure. We form the functions

    $$g_n(y) = I_x\, h_n(x, y).$$

The functions $g_n(y)$ are at any rate determined for all y that do
not correspond to lines of discontinuity of the $h_n(x, y)$. The
sequence $g_n(y)$ increases monotonely as $n \to \infty$, and the
integrals of the $g_n(y)$ have a common bound:

    $$I_y\, g_n(y) = I_y[I_x\, h_n(x, y)] = I h_n(x, y) \nearrow I\varphi.\ \dagger$$

By the corollary to Beppo Levi's theorem, the functions $g_n(y)$
converge for almost all y to some summable function g(y), and

    $$I_y\, g(y) = \lim_{n \to \infty} I_y\, g_n(y) = I\varphi.$$

Let y be a point of the set $E_y$ of full measure on which the function
g(y) is defined and finite. With this value of y, the functions
$h_n(x, y)$ as functions of x form a monotone increasing sequence,
and their integrals are bounded:

    $$I_x\, h_n(x, y) = g_n(y) \nearrow g(y).$$

Hence, in virtue of the same theorem of Beppo Levi, the functions
$h_n(x, y)$ tend for almost all x, i.e. on some set $E_{xy}$ of full
measure with respect to x, to some function $\varphi_0(x, y)$, so that

    $$\lim_{n \to \infty} I_x\, h_n(x, y) = g(y) = I_x\, \varphi_0(x, y).$$

We consequently obtain the result

    $$I\varphi = I_y\, g(y) = I_y\{I_x\, \varphi_0(x, y)\}. \qquad (1)$$

† The formula for repeated integration is of course valid for step functions.




The function $\varphi_0(x, y)$ coincides with the function
$\varphi(x, y)$ on the intersection of the sets E and $E_{xy}$, since
the limit of the sequence $h_n(x, y)$ is $\varphi(x, y)$ on the first
of these, and $\varphi_0(x, y)$ on the second.

In particular, if $h_n(x, y)$ converges everywhere to the function
$\varphi(x, y)$ then on the set $E_{xy}$,
$\varphi_0(x, y) = \varphi(x, y)$, and the assertion of the theorem
holds.

The last condition is realised, for example, if the functions
$h_{n+1} - h_n = e_n$ are the characteristic functions of
non-intersecting rectangles $D_n$ (n = 1, 2, ...). In this case the
integral $I\varphi$ is the (countable) sum of the areas of these
rectangles. The function $g_n(y) = I_x\, h_n(x, y)$ is the sum of the
lengths of the intervals formed by the intersection of the
corresponding horizontal line with the first n rectangles; the function
$g(y) = \lim g_n(y)$ is the (countable) sum of the lengths of all these
intervals.

Formula (1) permits us now to establish an important property
for any plane set Z of full measure. We claim that the set Z is
intersected by almost all horizontal lines (i.e. excepting a set of
linear measure zero in y) in a set of full linear measure in x.

For let us form, for a given $\varepsilon > 0$, a covering of the
complement CZ of the set Z by a system of non-intersecting rectangles
$D_1, \ldots, D_n, \ldots$ of overall area $< \varepsilon$, and let
$\varphi_\varepsilon(x, y)$ be the characteristic function of this
system. By what we have proved,

    $$I\varphi_\varepsilon(x, y) = I_y\{I_x\, \varphi_\varepsilon(x, y)\} = I_y\, g_\varepsilon(y).$$

We let $\varepsilon$ approach zero; then the functions
$\varphi_\varepsilon(x, y)$ can be regarded as forming a decreasing
sequence; consequently the functions
$g_\varepsilon(y) = I_x\, \varphi_\varepsilon(x, y)$ also form a
decreasing sequence. Since $I\varphi_\varepsilon \to 0$, we have
$I_y\, g_\varepsilon(y) \to 0$, and it follows that for almost all y,
$g_\varepsilon(y) \to 0$ too.

We denote by $E_0$ the set of all y for which
$g_\varepsilon(y) \to 0$; this set has full measure. But as we have
shown already, each value of the function $g_\varepsilon(y_0)$ is the
sum of the lengths of the intervals formed by the intersection of the
corresponding horizontal line $y = y_0$ with the rectangles
$D_1, \ldots, D_n, \ldots$, which cover the set CZ. We see that the
intersection of the set CZ with the line $y = y_0$, $y_0 \in E_0$, can
be covered by a (countable) system of intervals of arbitrarily small
overall length. It therefore has measure zero, and the intersection of
the set Z with the same line has full measure, as required.

We turn now to the general case. 




The sequence $h_n$ converges to $\varphi$ on a set E of full (plane)
measure; on the y-axis, a set $E'_y$ of full (linear) measure can be
found such that for $y_0 \in E'_y$ the functions $h_n(x, y_0)$ converge
to $\varphi(x, y_0)$ on a set $E'_{x y_0}$ of full measure in x. To
construct the set $E_y$ on which the sequence $g_n(y)$ converges, we
can exclude in advance points not contained in $E'_y$ (they form a set
of measure zero), and we can therefore suppose that
$E_y \subset E'_y$. Further, for a fixed $y_0 \in E_y$, in constructing
the set $E_{x y_0}$ on which the sequence $h_n(x, y_0)$ converges to
the function $\varphi_0(x, y_0)$, we can exclude points not contained
in $E'_{x y_0}$ and therefore assume that
$E_{x y_0} \subset E'_{x y_0}$. But on the set $E'_{x y_0}$ we have
$h_n(x, y_0) \nearrow \varphi(x, y_0)$. Hence
$\varphi(x, y_0) = \varphi_0(x, y_0)$ everywhere on $E_{x y_0}$. Thus
we find that for $y \in E_y$ the function $\varphi(x, y)$ is summable
with respect to x, its integral $I_x\, \varphi(x, y) = g(y)$ is a
summable function of y, and

    $$I_y\{I_x\, \varphi(x, y)\} = I\varphi.$$

The proof of Fubini's theorem is therefore complete.
Note. It does not, in general, follow from the existence of the
integrals

    $$I_y\{I_x\, \varphi(x, y)\}, \qquad I_x\{I_y\, \varphi(x, y)\} \qquad (2)$$

either that they are equal or that the function $\varphi(x, y)$ is
summable on the rectangle D. But if the function $\varphi(x, y)$ is
measurable and non-negative, then the existence of one of the integrals
(2) entails its summability on D, and the equation

    $$I\varphi = I_y\{I_x\, \varphi\} = I_x\{I_y\, \varphi\}. \qquad (3)$$

For let there exist, say, $I_y\{I_x\, \varphi(x, y)\} = A$ and let
$\varphi_n(x, y) = \min\{\varphi(x, y),\, n\}$. The function
$\varphi_n(x, y)$ is measurable, bounded, and consequently summable on
D; by Fubini's theorem

    $$I\varphi_n = I_y\{I_x\, \varphi_n\} \le A.$$

As n increases, the functions $\varphi_n$ increase monotonely and tend
to the function $\varphi$; since $I\varphi_n \le A$, the function
$\varphi$ is summable, by Beppo Levi's theorem. But then we can apply
Fubini's theorem, and we get (3), as required.
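The first assertion of the Note can be illustrated by a classical example, which we add here as an illustration and which is not taken from the text. On the square $0 < x \le 1$, $0 < y \le 1$ both repeated integrals of the function below exist, yet they differ; by Fubini's theorem the function therefore cannot be summable on the square.

```latex
% Assumed illustrative example: repeated integrals that exist but differ.
f(x, y) = \frac{x^2 - y^2}{(x^2 + y^2)^2}, \qquad
\frac{\partial}{\partial y}\,\frac{y}{x^2 + y^2} = \frac{x^2 - y^2}{(x^2 + y^2)^2}, \\[4pt]
I_y f(x, y) = \Bigl[\frac{y}{x^2 + y^2}\Bigr]_{y=0}^{y=1} = \frac{1}{1 + x^2}
\quad (x > 0), \qquad
I_x\{I_y f\} = \int_0^1 \frac{dx}{1 + x^2} = \frac{\pi}{4}, \\[4pt]
f(y, x) = -f(x, y) \ \Longrightarrow\ I_y\{I_x f\} = -\frac{\pi}{4} .
```

The function f changes sign, so the second assertion of the Note, which requires a non-negative function, does not apply to it.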

Example. If $\varphi(x)$ is summable for $a \le x \le b$ and
$\psi(y)$ is summable for $a \le y \le b$, then $\varphi(x)\,\psi(y)$
is summable on the set $E = \{a \le x \le b,\ a \le y \le b\}$ and

    $$I(\varphi \psi) = I_x \varphi \cdot I_y \psi. \qquad (4)$$

For the function $|\varphi|\,|\psi|$ is evidently measurable and the
integral
$I_y\{I_x(|\varphi|\,|\psi|)\} = I_y\{|\psi|\, I_x(|\varphi|)\} = I_y(|\psi|)\, I_x(|\varphi|)$
exists; the function $|\varphi \psi|$ is therefore summable on E, and
together with it $\varphi \psi$. Applying (3), we get (4), as required.

3. The space $L_p$. Let p be a positive number. $L_p$ is defined as
the class of all functions f(x), defined on a region G, for which

    $$I(|f|^p) = \int_G |f(x)|^p\,dx < \infty.$$

For any p > 0, $L_p$ is a linear space. For clearly, for any $\alpha$,
$\alpha f$ belongs to $L_p$ together with f. Further, if $f \in L_p$,
$g \in L_p$, then $f + g \in L_p$, since

    $$|f + g|^p \le (|f| + |g|)^p \le [2 \sup(|f|, |g|)]^p = 2^p \sup(|f|^p, |g|^p),$$

and $\sup(|f|^p, |g|^p) \le |f|^p + |g|^p$ is summable.

We shall show that for $p \ge 1$ this linear space can be converted
into a complete normed linear space by introducing the norm

    $$\|f\|_p = I^{1/p}(|f|^p). \qquad (1')$$

We have already substantiated this claim for the case p = 1
(Section 3). We therefore restrict ourselves to the case p > 1.

It is evident that the norm (1') satisfies the condition
$\|f\|_p \ge 0$, and $\|f\|_p = 0$ implies that f(x) = 0 almost
everywhere; clearly also $\|\alpha f\|_p = |\alpha|\, \|f\|_p$ for any
real $\alpha$. The triangle inequality presents more difficulty. We
consider first the following lemma:

Lemma. If $\eta = \omega(\xi)$ is an increasing continuous function,
$\omega(0) = 0$, and $\xi = u(\eta)$ the inverse function (obviously
also continuous and increasing), then for any $x \ge 0$, $y \ge 0$,

    $$xy \le \int_0^x \omega(\xi)\,d\xi + \int_0^y u(\eta)\,d\eta.$$

The proof is obtained quickly by considering Fig. 10.


To be particular, let us put $\omega(\xi) = \xi^{p-1}$ (p > 1),
$u(\eta) = \eta^{1/(p-1)}$; then we get

    $$xy \le \frac{x^p}{p} + \frac{y^q}{q}. \qquad (1)$$

The number q here is defined by the equation

    $$q = \frac{1}{p-1} + 1 = \frac{p}{p-1}, \qquad (2)$$

so that

    $$\frac{1}{p} + \frac{1}{q} = 1.$$

Applying inequality (1) to functions f(x), g(x), we get

    $$|f(x)|\, |g(x)| \le \frac{|f(x)|^p}{p} + \frac{|g(x)|^q}{q}. \qquad (3)$$

[Fig. 10]

Let us suppose for the moment that $\|f\|_p = I^{1/p}(|f|^p) = 1$,
$\|g\|_q = I^{1/q}(|g|^q) = 1$. Then inequality (3) shows that

    $$I(|fg|) \le \frac{1}{p} + \frac{1}{q} = 1.$$

Now if $f \in L_p$, $g \in L_q$ are arbitrary functions with non-zero
norms, then for the functions $f_0 = \dfrac{f}{\|f\|_p}$,
$g_0 = \dfrac{g}{\|g\|_q}$ the conditions
$\|f_0\|_p = \|g_0\|_q = 1$ are satisfied; it follows that

    $$I(|f_0 g_0|) = \frac{I(|fg|)}{\|f\|_p\, \|g\|_q} \le 1,$$

and hence for any $f \in L_p$, $g \in L_q$,

    $$I(|fg|) \le \|f\|_p\, \|g\|_q$$

(Hölder's inequality).
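Inequalities (1) and Hölder's inequality are easy to test on step functions, for which the integral reduces to a finite sum. The sketch below makes assumptions of its own: a step function is represented by a list of values together with the lengths of the intervals on which they are taken, and the names `young_gap`, `lp_norm` and `holder_gap` are hypothetical helpers, not notation from the text.

```python
import random


def young_gap(x, y, p):
    """x**p / p + y**q / q - x * y for x, y >= 0, p > 1 and 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return x ** p / p + y ** q / q - x * y


def lp_norm(values, lengths, p):
    """I^(1/p)(|f|^p) for a step function equal to values[j] on an
    interval of length lengths[j]."""
    return sum(w * abs(v) ** p for v, w in zip(values, lengths)) ** (1.0 / p)


def holder_gap(f, g, lengths, p):
    """||f||_p * ||g||_q - I(|f g|); Hoelder's inequality says this is >= 0."""
    q = p / (p - 1.0)
    integral_fg = sum(w * abs(a * b) for a, b, w in zip(f, g, lengths))
    return lp_norm(f, lengths, p) * lp_norm(g, lengths, q) - integral_fg


rng = random.Random(0)  # fixed seed so that the run is reproducible
lengths = [rng.uniform(0.1, 1.0) for _ in range(50)]
f = [rng.uniform(-3.0, 3.0) for _ in range(50)]
g = [rng.uniform(-3.0, 3.0) for _ in range(50)]

for p in (1.5, 2.0, 4.0):
    print(p, young_gap(2.0, 5.0, p), holder_gap(f, g, lengths, p))
```

Equality in (1) is attained when $y = x^{p-1}$, i.e. when the point (x, y) lies on the curve $\eta = \omega(\xi)$; for p = 3, x = 2, y = 4 both sides equal 8.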
We suppose now that $f \in L_p$, $g \in L_p$, and obtain a bound for
$\|f + g\|_p$. We have

    $$|f + g|^p \le (|f| + |g|)^p = |f|\,(|f| + |g|)^{p-1} + |g|\,(|f| + |g|)^{p-1}. \qquad (4)$$

The number p - 1 is equal to p/q [as is evident from (2)]; hence

    $$(|f| + |g|)^{p-1} = (|f| + |g|)^{p/q} \in L_q,$$

and consequently

    $$\|(|f| + |g|)^{p-1}\|_q = I^{1/q}[(|f| + |g|)^p].$$

Integrating (4) and applying Hölder's inequality, we get

    $$I[(|f| + |g|)^p] \le \|f\|_p\, \|(|f| + |g|)^{p-1}\|_q + \|g\|_p\, \|(|f| + |g|)^{p-1}\|_q = (\|f\|_p + \|g\|_p)\, I^{1/q}[(|f| + |g|)^p].$$

Dividing by $I^{1/q}[(|f| + |g|)^p]$, and remembering that
$1 - \dfrac{1}{q} = \dfrac{1}{p}$, we find:

    $$\|f + g\|_p = I^{1/p}(|f + g|^p) \le I^{1/p}[(|f| + |g|)^p] \le \|f\|_p + \|g\|_p.$$

Thus, for p > 1, the triangle inequality is satisfied.
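The role of the restriction $p \ge 1$ can be seen on the smallest possible example. The sketch below assumes two step functions on two intervals of unit length (an illustration of our own, with hypothetical helper names): for p = 1/2 the expression (1') violates the triangle inequality, while for $p \ge 1$ it satisfies it.

```python
def lp_norm(values, lengths, p):
    """I^(1/p)(|f|^p) for a step function equal to values[j] on an
    interval of length lengths[j]."""
    return sum(w * abs(v) ** p for v, w in zip(values, lengths)) ** (1.0 / p)


def minkowski_gap(f, g, lengths, p):
    """||f||_p + ||g||_p - ||f + g||_p; non-negative when the triangle
    inequality holds."""
    total = [a + b for a, b in zip(f, g)]
    return lp_norm(f, lengths, p) + lp_norm(g, lengths, p) - lp_norm(total, lengths, p)


lengths = [1.0, 1.0]
f = [1.0, 0.0]  # equal to 1 on the first interval, 0 on the second
g = [0.0, 1.0]  # equal to 0 on the first interval, 1 on the second

for p in (0.5, 1.0, 2.0, 3.0):
    # Negative gap for p = 0.5: there ||f + g|| = 4 while ||f|| + ||g|| = 2.
    print(p, minkowski_gap(f, g, lengths, p))
```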

It remains for us to verify that the space $L_p$ is complete. This
is achieved in an analogous way to that for the completeness of
the space $L = L_1$. If $\varphi_1, \varphi_2, \ldots$ is a fundamental
sequence in the space $L_p$, a subsequence $\psi_k = \varphi_{n_k}$ can
be extracted from it so as to satisfy the conditions

    $$\|\psi_{k+1} - \psi_k\|_p < \frac{1}{2^k}.$$

If the region G is bounded, then by Hölder's inequality

    $$I(|\psi_{k+1} - \psi_k|) \le \|1\|_q\, \|\psi_{k+1} - \psi_k\|_p \le C\, 2^{-k},$$

from which follows, just as for p = 1, the convergence (almost
everywhere) of the series

    $$\sum_{k=1}^{\infty} (\psi_{k+1} - \psi_k)$$

and the existence (almost everywhere) of a limit function
$\varphi = \lim_{k \to \infty} \psi_k$. In the case of an infinite
region G, integration over an arbitrary bounded subregion also ensures
the existence of a limit function almost everywhere on G.

Since $|\varphi|^p = \lim |\psi_k|^p$ and the $I(|\psi_k|^p)$ are
bounded by a number M, say, it follows that $\varphi \in L_p$ and
$I(|\varphi|^p) \le M$.

Just as in Section 3, we get a bound

    $$I(|\varphi - \varphi_n|^p) \le \sup_k I(|\psi_k - \varphi_n|^p) \le \varepsilon$$

for sufficiently large n, whence it follows that
$\|\varphi - \varphi_n\|_p \to 0$, completing the proof.

It is proved exactly as in Section 3 that the totality of continuous
functions on the closed interval [a, b] is dense in the space $L_p$
with respect to the metric of that space.


Concluding remark 


The first work done by H. Lebesgue (French mathematician,
1875-1941) on the theory of measure and of the integral dates
back to 1902. Originally designed to solve problems in the theory
of functions of a real variable (the original problem concerned the
convergence of trigonometric series), Lebesgue theory very soon
exceeded these narrow confines. The Fischer-Riesz theorem on the
completeness of the space of summable functions (1907) and the
introduction by Riesz of the space $L_p$ ultimately rendered the
Lebesgue integral indispensable in general contemporary problems
in analysis and mathematical physics. F. Riesz (Hungarian
mathematician, 1880-1956) also proposed the scheme of construction
for integral theory adopted in our exposition. Our recommended
literature is: H. L. Lebesgue, Leçons sur l'intégration et la recherche
des fonctions primitives, Gauthier-Villars, Paris (1950); F. Riesz
and B. Szőkefalvi-Nagy, Functional Analysis, London (1956).


CHAPTER V 


GEOMETRY OF HILBERT SPACE 


1. BASIC DEFINITIONS AND EXAMPLES


1. A linear space H (under multiplication by real numbers) is 
said to be a (real) Hilbert space if: (1) there is a rule which maps 
each pair of points (vectors) of H onto a real number, called the 
scalar product of the vectors x, y and denoted by (x, y); (2) this 
rule satisfies the following requirements: 

(a) (y, x) = (x, y) (transposition law);
(b) (x, y + z) = (x, y) + (x, z) (distributive law);
(c) $(\lambda x, y) = \lambda (x, y)$ for any real $\lambda$;
(d) (x, x) > 0 for $x \ne 0$ and (x, x) = 0 for x = 0.
From axioms (a)-(c) we easily obtain the general formula

    $$\Bigl( \sum_{i=1}^{k} \alpha_i x_i,\ \sum_{j=1}^{m} \beta_j y_j \Bigr) = \sum_{i=1}^{k} \sum_{j=1}^{m} \alpha_i \beta_j\, (x_i, y_j), \qquad (1)$$

which holds for any vectors $x_1, \ldots, x_k$, $y_1, \ldots, y_m$ and
arbitrary real numbers $\alpha_1, \ldots, \alpha_k$,
$\beta_1, \ldots, \beta_m$.

Examples 1. In the n-dimensional space $R_n$, the elements of
which are arrays $x = (\xi_1, \ldots, \xi_n)$ of real numbers, we
introduce a scalar product for vectors $x = (\xi_1, \ldots, \xi_n)$,
$y = (\eta_1, \ldots, \eta_n)$ by the formula

    $$(x, y) = \xi_1 \eta_1 + \cdots + \xi_n \eta_n. \qquad (2)$$


(This definition generalizes the well-known expression for the 
scalar product of vectors in three-dimensional space in terms of 
their coordinates relative to an orthogonal coordinate system). 
The reader will easily verify that requirements (a)—(d) are satis- 
fied. A finite-dimensional (real) Hilbert space is usually called a 
Euclidean space. 

2. The following example represents the infinite-dimensional 
analogue of the foregoing. 




The space $l_2$. A sequence of numbers
$x = (\xi_1, \ldots, \xi_n, \ldots)$ is an element of this space if the
sum of their squares converges:

    $$\sum_{n=1}^{\infty} \xi_n^2 < \infty.$$

Linear operations are defined in the natural way:

    $$(\xi_1, \ldots, \xi_n, \ldots) + (\eta_1, \ldots, \eta_n, \ldots) = (\xi_1 + \eta_1, \ldots, \xi_n + \eta_n, \ldots),$$
    $$\alpha\, (\xi_1, \ldots, \xi_n, \ldots) = (\alpha \xi_1, \ldots, \alpha \xi_n, \ldots).$$

The scalar product of vectors $x = (\xi_1, \ldots, \xi_n, \ldots)$,
$y = (\eta_1, \ldots, \eta_n, \ldots)$ is given by the formula

    $$(x, y) = \sum_{n=1}^{\infty} \xi_n \eta_n. \qquad (3)$$

We have to verify the correctness of these definitions. In the
first place it follows from the elementary inequality

    $$|\xi_n \eta_n| \le \tfrac{1}{2}\, (\xi_n^2 + \eta_n^2)$$

that the series (3) always converges for $x \in l_2$, $y \in l_2$.
Further, the equations

    $$\sum_{k=n}^{n+m} (\alpha \xi_k)^2 = \alpha^2 \sum_{k=n}^{n+m} \xi_k^2,$$
    $$\sum_{k=n}^{n+m} (\xi_k + \eta_k)^2 = \sum_{k=n}^{n+m} \xi_k^2 + 2 \sum_{k=n}^{n+m} \xi_k \eta_k + \sum_{k=n}^{n+m} \eta_k^2$$

show that for $x \in l_2$, $y \in l_2$, the series

    $$\sum_{k=1}^{\infty} (\alpha \xi_k)^2, \qquad \sum_{k=1}^{\infty} (\xi_k + \eta_k)^2$$

converge, and hence the operations we have defined for the addition
of vectors and their multiplication by scalars are feasible in the
space $l_2$.

It remains for us to verify that axioms (a)-(d) are satisfied; but
a cursory glance suffices to convince us of this.
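The convergence argument for the series (3) can be watched on finite sections of two concrete sequences. The sketch below assumes the sequences $\xi_n = 1/n$ and $\eta_n = (-1)^n n^{-3/4}$, chosen by us only because their squares have convergent sums; the helper names are hypothetical.

```python
def partial_scalar_product(xi, eta):
    """Sum of xi[n] * eta[n] over a finite section: a partial sum of (3)."""
    return sum(a * b for a, b in zip(xi, eta))


def half_sum_of_squares(xi, eta):
    """(1/2) * (sum of xi[n]**2 + sum of eta[n]**2), the bound supplied by
    the elementary inequality."""
    return 0.5 * (sum(a * a for a in xi) + sum(b * b for b in eta))


N = 10_000
xi = [1.0 / n for n in range(1, N + 1)]
eta = [(-1.0) ** n / n ** 0.75 for n in range(1, N + 1)]

for m in (10, 100, 1000, N):
    # The partial sums of (3) stay below the bound in absolute value.
    print(m, partial_scalar_product(xi[:m], eta[:m]),
          half_sum_of_squares(xi[:m], eta[:m]))
```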

3. As we have seen (Chapter IV, Section 5), the space $L_2(a, b)$
of functions $\varphi(x)$ with a summable square on the closed interval
$a \le x \le b$ is a linear space; we introduce a scalar product on it
by the formula

    $$(\varphi, \psi) = \int_a^b \varphi(x)\, \psi(x)\,dx.$$

The fulfilment of the axioms for a scalar product follows here from
the properties of the Lebesgue integral. We observe that the
integrability of the product $\varphi \psi$ follows from the inequality

    $$|\varphi(x)\, \psi(x)| \le \tfrac{1}{2}\, (|\varphi(x)|^2 + |\psi(x)|^2).$$


2. Isomorphism of Hilbert spaces 


The definition of an isomorphism between Hilbert spaces is 
formulated in analogy with the definitions of equivalence of sets, 
isometry of metric spaces, and isomorphism of linear spaces, 
familiar to us from the first and second chapters. Two Hilbert
spaces H', H'' are said to be isomorphic if there exists between
them a one-one correspondence with the following properties:

(1) If vectors x'', y'' in the space H'' are the images of x', y' in
H', then the vector $x'' + y'' \in H''$ is the image of
$x' + y' \in H'$ and the vector $\alpha x'' \in H''$ is the image of
$\alpha x' \in H'$ for any real $\alpha$.

(2) Under the same hypotheses, the numbers (x', y'), (x'', y'') are
equal.

We shall show subsequently (Section 2, art. 3) that any two
finite-dimensional Hilbert spaces of equal dimensionality are
isomorphic to one another (and therefore isomorphic to the space $R_n$
of example 1). The spaces $L_2(a, b)$ and $l_2$ are also in fact
isomorphic to one another (Section 2, art. 6).


3. Length of a Vector and Angle between Vectors 


The scalar product permits us to introduce the concepts of the 
length (norm) of a vector and of the angle between vectors in 
Hilbert space by the formulae 


    $$\|x\| = +\sqrt{(x, x)}, \qquad (1)$$
    $$\cos(\widehat{x, y}) = \frac{(x, y)}{\|x\|\, \|y\|}; \qquad (2)$$

these definitions agree with the usual formulae of analytic geometry.
We shall often simply write |x| instead of $\|x\|$.

Let us consider these definitions in general Hilbert space. We
shall prove that, whatever the vectors x, y, the ratio on the
right-hand side of (2) cannot exceed unity in absolute value.

For the proof of this assertion, we consider the vector
$\lambda x - y$, where $\lambda$ is a real number. In virtue of axiom
(d) we have for any $\lambda$

    $$(\lambda x - y,\ \lambda x - y) \ge 0. \qquad (3)$$

Using formula (1) of art. 1, we can write this inequality in the form

    $$\lambda^2 (x, x) - 2 \lambda (x, y) + (y, y) \ge 0. \qquad (4)$$

The left-hand side of the inequality is a quadratic trinomial
in $\lambda$ with constant coefficients. It cannot have distinct real
zeros, since then it would not preserve its sign for all values of
$\lambda$. Its discriminant $(x, y)^2 - (x, x)(y, y)$ cannot,
therefore, be positive. Hence $(x, y)^2 \le (x, x)(y, y)$ and taking
the square root, we get

    $$|(x, y)| \le |x|\, |y|, \qquad (5)$$

as required.

We determine when the equality sign in (5) is possible. If

    $$|(x, y)| = |x|\, |y|,$$

then the discriminant of the quadratic trinomial (4) vanishes, and
the trinomial therefore has one real zero $\lambda_0$. Thus we get

    $$\lambda_0^2 (x, x) - 2 \lambda_0 (x, y) + (y, y) = (\lambda_0 x - y,\ \lambda_0 x - y) = 0,$$

and in virtue of axiom (d) we find that $\lambda_0 x - y = 0$, or
$y = \lambda_0 x$.
Our result is capable of a geometrical formulation: if the scalar
product of two vectors is equal in absolute value to the product of their
lengths, then the vectors are collinear.
The inequality (5) is termed the Cauchy-Bunyakovsky inequality.
Examples. 1. In the Euclidean space $R_n$, the Cauchy-Bunyakovsky inequality has the form

$$\left|\sum_{j=1}^{n} \xi_j \eta_j\right| \leq \sqrt{\sum_{j=1}^{n} \xi_j^2}\,\sqrt{\sum_{j=1}^{n} \eta_j^2}; \tag{6}$$

it holds for any pair of vectors $x = (\xi_1, \dots, \xi_n)$, $y = (\eta_1, \dots, \eta_n)$, or what is the same thing, for any two systems of real numbers $\xi_1, \xi_2, \dots, \xi_n$ and $\eta_1, \eta_2, \dots, \eta_n$. (This inequality was discovered by Cauchy in 1821.)

2. In the space $L_2(a, b)$ the Cauchy-Bunyakovsky inequality becomes

$$\left|\int_a^b \varphi(x)\psi(x)\,dx\right| \leq \sqrt{\int_a^b \varphi^2(x)\,dx}\,\sqrt{\int_a^b \psi^2(x)\,dx}; \tag{7}$$

it holds for any pair of square-summable functions $\varphi(x)$, $\psi(x)$. (This inequality for Riemann-integrable functions was discovered by V. Ya. Bunyakovsky in 1859.)
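
As a numerical illustration of formulae (1), (2) and inequality (6), the following minimal sketch works in $R_n$ with vectors stored as Python lists. The helper names `dot`, `norm` and `cos_angle` are our own and purely illustrative; the random sampling only spot-checks the inequality and is of course no substitute for the proof above.

```python
import math
import random

def dot(x, y):
    """Scalar product (x, y) = sum of xi_j * eta_j in R_n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length |x| = +sqrt((x, x)), formula (1)."""
    return math.sqrt(dot(x, x))

def cos_angle(x, y):
    """Cosine of the angle between two non-zero vectors, formula (2)."""
    return dot(x, y) / (norm(x) * norm(y))

random.seed(0)  # fixed seed so that the run is repeatable
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    # Inequality (6); the small tolerance absorbs floating-point rounding.
    assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-12

# Equality case: y = lambda_0 * x (collinear vectors).
x = [1.0, 2.0, -2.0]
y = [3.0 * c for c in x]
print(abs(dot(x, y)), norm(x) * norm(y))
```

For the collinear pair at the end the two printed numbers coincide, in agreement with the discussion of the equality sign in (5).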

We now proceed to investigate the properties of the norm. 

It follows from axiom (d) that each vector $x$ of a Hilbert space $H$ has a norm: any non-zero vector has a positive norm, and the norm of the null vector is zero. The equation

$$|\alpha x| = \sqrt{(\alpha x, \alpha x)} = \sqrt{\alpha^2 (x, x)} = |\alpha| \sqrt{(x, x)} = |\alpha|\,|x| \tag{8}$$

shows that the modulus of a scalar multiplier of a vector can be carried through the norm sign. Finally, the norm satisfies the triangle inequality

$$|x + y| \leq |x| + |y|. \tag{9}$$

For by the Cauchy-Bunyakovsky inequality,

$$|x + y|^2 = (x + y, x + y) = (x, x) + 2(x, y) + (y, y) \leq |x|^2 + 2|x|\,|y| + |y|^2 = (|x| + |y|)^2,$$

and (9) follows on taking the square root.

Thus the norm in the space H satisfies the axioms for a normed 
linear space (Chapter II, Section 8). All the metric concepts and 
properties connected with the existence of a norm are valid in Hilbert 
space. But since Hilbert space is a very particular instance of a 
normed space, it is only reasonable to expect that the norm in 
Hilbert space will possess properties peculiar to Hilbert spaces. 
One such property is given by the following lemma: 

Parallelogram lemma. For any two vectors $x$, $y$ of a Hilbert space $H$,

$$|x + y|^2 + |x - y|^2 = 2|x|^2 + 2|y|^2$$

(the sum of the squares of the diagonals of a parallelogram is equal to the sum of the squares of its sides).

The proof follows from the simple computation:

$$|x + y|^2 + |x - y|^2 = (x + y, x + y) + (x - y, x - y) = 2(x, x) + 2(y, y) = 2|x|^2 + 2|y|^2.$$

Problem. It is known that the parallelogram lemma holds in some normed space $R$ for any pair of vectors $x$, $y$. Consider the function

$$(x, y) = \tfrac{1}{4}\left(|x + y|^2 - |x - y|^2\right) \quad (\text{so that } (x, x) = |x|^2)$$

and prove that it satisfies all the axioms of art. 1 for a scalar product.

Hint. To show that axiom (b) is satisfied, apply the parallelogram lemma to parallelograms constructed on the vectors: (1) $x + z$, $y$; (2) $x - z$, $y$; (3) $y + z$, $x$; (4) $y - z$, $x$. Verify axiom (c) first for integral $\lambda$, then for fractions, and then pass to the limit.
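
The parallelogram lemma, and the expression for the scalar product in terms of the norm considered in the problem, can be tried out numerically. The sketch below is only an illustration in $R_3$; the function names are our own. For contrast it also evaluates the parallelogram expression for the maximum norm on $R_2$, which is a norm not generated by any scalar product, so the identity is not to be expected there.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def parallelogram_gap(x, y):
    """|x + y|^2 + |x - y|^2 - 2|x|^2 - 2|y|^2; zero in a Hilbert space."""
    return (norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2
            - 2 * norm(x) ** 2 - 2 * norm(y) ** 2)

def polarised(x, y):
    """Scalar product recovered from the norm alone: (|x+y|^2 - |x-y|^2) / 4."""
    return (norm(add(x, y)) ** 2 - norm(sub(x, y)) ** 2) / 4

def max_norm(x):
    """The maximum norm, used here only as a counter-example."""
    return max(abs(a) for a in x)

x, y = [3.0, -1.0, 2.0], [0.5, 4.0, -2.0]
print(parallelogram_gap(x, y))       # zero up to rounding
print(polarised(x, y), dot(x, y))    # the two values agree up to rounding

# With the maximum norm the identity fails for u = (1, 0), v = (0, 1):
u, v = [1.0, 0.0], [0.0, 1.0]
gap = (max_norm(add(u, v)) ** 2 + max_norm(sub(u, v)) ** 2
       - 2 * max_norm(u) ** 2 - 2 * max_norm(v) ** 2)
print(gap)                           # 1 + 1 - 2 - 2 = -2
```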


4. The Limiting Process in Hilbert Space 


Together with the metric, there arise in Hilbert space concepts connected with passing to a limit. We say that a sequence $x_1, \dots, x_n, \dots$ of elements of a Hilbert space $H$ converges to an element $x$ (or has an element $x$ as a limit) if

$$\lim_{n \to \infty} |x - x_n| = 0.$$

A function $f(x)$, defined on the space $H$, is said to be continuous at a point $x$ if $x_n \to x$ implies $f(x_n) \to f(x)$. Continuous functions of two or more variables are defined similarly.

By way of example, we show that the scalar product $(x, y)$ is a continuous function of both variables $x$, $y$, i.e. if $x_n \to x$, $y_n \to y$, then $(x_n, y_n) \to (x, y)$.

We put $y - y_n = h_n$, $x - x_n = k_n$; by hypothesis $h_n \to 0$, $k_n \to 0$. By the Cauchy-Bunyakovsky inequality

$$|(x, y) - (x_n, y_n)| = |(x, y) - (x - k_n, y - h_n)| = |(x, h_n) + (y, k_n) - (k_n, h_n)| \leq |x|\,|h_n| + |y|\,|k_n| + |k_n|\,|h_n|;$$

as $n$ increases, this quantity tends to zero. It follows that $(x, y) = \lim_{n \to \infty} (x_n, y_n)$, as required.


A sequence $\{x_n\} \in H$ is said to be fundamental if

$$\lim_{n, m \to \infty} |x_n - x_m| = 0.$$

The space $H$ is said to be complete if every fundamental sequence has a limit in it.

All three spaces considered above, $R_n$, $l_2$ and $L_2$, are complete.

The proof of the completeness of the space $R_n$ is elementary (cf. Chapter II, Section 4). We proved that the space $L_2$ was complete in Chapter IV, Section 5, art. 3. We shall show now that the space $l_2$ is also complete.




Let $x_m = (\xi_1^{(m)}, \dots, \xi_n^{(m)}, \dots)$ $(m = 1, 2, \dots)$ be a fundamental sequence of vectors of the space $l_2$. Since, by hypothesis, as $m$ and $p$ tend to infinity

$$|x_m - x_p|^2 = \sum_{n=1}^{\infty} |\xi_n^{(m)} - \xi_n^{(p)}|^2 \to 0,$$

in particular each term $|\xi_n^{(m)} - \xi_n^{(p)}|^2$ (for fixed $n$) will tend to zero as $m$ and $p$ increase without bound, and therefore by Cauchy's classical criterion the sequence of coordinates $\xi_n^{(m)}$ $(m = 1, 2, \dots)$ will converge for each fixed $n$. We denote $\xi_n = \lim_{m \to \infty} \xi_n^{(m)}$ and show that the vector $x = (\xi_1, \dots, \xi_n, \dots)$ belongs to $l_2$.

Since the elements of a fundamental sequence have a common bound, we have

$$|x_m|^2 = \sum_{n=1}^{\infty} |\xi_n^{(m)}|^2 \leq K,$$

where $K$ is independent of $m$. Hence for any fixed $N$

$$\sum_{n=1}^{N} \xi_n^2 = \lim_{m \to \infty} \sum_{n=1}^{N} |\xi_n^{(m)}|^2 \leq K,$$

and the convergence of the series $\sum_{n=1}^{\infty} \xi_n^2$ follows immediately.

It remains to show that $\|x - x_m\|$ tends to zero as $m \to \infty$. To do this, we take the limit as $p \to \infty$ in the inequality

$$\sum_{n=1}^{N} |\xi_n^{(m)} - \xi_n^{(p)}|^2 \leq \varepsilon,$$

which for a given $\varepsilon > 0$ holds for sufficiently large $m$, $p$ and any $N$. As a result, we obtain the inequality

$$\sum_{n=1}^{N} (\xi_n^{(m)} - \xi_n)^2 \leq \varepsilon,$$

and by passing to the limit as $N \to \infty$, we get

$$|x_m - x|^2 = \sum_{n=1}^{\infty} |\xi_n^{(m)} - \xi_n|^2 \leq \varepsilon,$$

holding for all sufficiently large $m$, as required.

If the space $H$ is incomplete, then as we saw in Chapter II, its completion can be constructed. The elements of the completion are all the possible classes of equivalent fundamental sequences. It was proved in Chapter II that linear operations and a norm can be naturally introduced into the completion of a normed linear space so as to make it again a normed linear space. If $H$ is a Hilbert space, a scalar product can also be naturally introduced into its completion. For let classes $X$, $Y$ be given, and let us take any sequences $\{x_n\} \in X$, $\{y_n\} \in Y$. We contend that there exists a limit of the expression $(x_n, y_n)$ as $n \to \infty$. For we have:

$$|(x_n, y_n) - (x_m, y_m)| \leq |(x_n, y_n - y_m)| + |(x_n - x_m, y_m)| \leq |x_n|\,|y_n - y_m| + |x_n - x_m|\,|y_m|.$$

The numbers $|x_n|$, $|y_n|$ are bounded since the corresponding sequences $\{x_n\}$, $\{y_n\}$ are fundamental in $H$. Hence the sequence $(x_n, y_n)$ satisfies the usual Cauchy criterion and therefore has a limit, as asserted. This limit is independent of the choice of sequences $\{x_n\}$, $\{y_n\}$ in the classes $X$, $Y$. For if $\{x_n'\} \in X$, $\{y_n'\} \in Y$, then

$$|(x_n', y_n') - (x_n, y_n)| = |(x_n' - x_n, y_n') + (x_n, y_n' - y_n)| \leq |x_n' - x_n|\,|y_n'| + |x_n|\,|y_n' - y_n| \to 0,$$

since $y_n'$, $x_n$ are bounded, $\{x_n'\}$ is equivalent to $\{x_n\}$ and $\{y_n'\}$ to $\{y_n\}$.

It is easily verified that axioms (a)-(d) are satisfied, and we shall not dwell on the details of the verification.

Thus the completion of a Hilbert space is again a Hilbert space. 


Problems. 1. Let $d_1, d_2, \dots$ be a fixed sequence of numbers such that $\sum_{n=1}^{\infty} \xi_n d_n$ converges for any sequence $(\xi_n) \in l_2$; show that $(d_n) \in l_2$.

Hint. Show first that $d_n \to 0$. Further, suppose that $\sum_{n=1}^{\infty} d_n^2 = \infty$, and consider the arrays $(d_{n_k}, \dots, d_{n_{k+1}-1})$, for which $1 \leq \sum_{n=n_k}^{n_{k+1}-1} d_n^2 < 2$. Put $\xi_n = d_n / k$ for $n_k \leq n < n_{k+1}$, and show that $(\xi_n) \in l_2$, but $\sum_{n=1}^{\infty} \xi_n d_n = \infty$.

2. Let $c_1, c_2, \dots$ be a fixed sequence of positive numbers; show that the set $M$ of all elements $(\xi_1, \dots, \xi_n, \dots) \in l_2$ with $|\xi_n| \leq c_n$ $(n = 1, 2, \dots)$ is locally compact if and only if $\sum_{n=1}^{\infty} c_n^2 < \infty$.

Hint. If $\sum_{n=N+1}^{\infty} c_n^2 < \varepsilon$, the set of elements $(\xi_1, \dots, \xi_N, 0, 0, \dots)$ with $|\xi_n| \leq c_n$ forms a locally compact $\varepsilon$-net for $M$. For $\sum_{n=1}^{\infty} c_n^2 = \infty$ consider the arrays $(c_{n_k}, \dots, c_{n_{k+1}-1})$ for which $\sum_{n=n_k}^{n_{k+1}-1} c_n^2 > 1$; the corresponding elements $(0, \dots, 0, c_{n_k}, \dots, c_{n_{k+1}-1}, 0, \dots)$ belong to $M$ and are mutually separated by distances greater than $\sqrt{1 + 1}$.

3. Prove that a set $M$ of elements $(\xi_1, \dots, \xi_n, \dots) \in l_2$ is locally compact if and only if: (1) all the numbers $|\xi_n|$ are bounded by a fixed constant; (2) all the series $\sum_{n=1}^{\infty} \xi_n^2$ converge uniformly on $M$, i.e. for any $\varepsilon > 0$ a number $N$ can be found such that $\sum_{n=N}^{\infty} \xi_n^2 < \varepsilon$ for all $(\xi_n) \in M$.

Hint. Use the method of problem 2.


2. ORTHOGONAL RESOLUTIONS 
1. Orthogonality 


Vectors $x$, $y$ are said to be orthogonal if $(x, y) = 0$. If $x \neq 0$, $y \neq 0$, then in accordance with the general definition of the angle between two vectors (Section 1, art. 3) this definition means that $x$ and $y$ form an angle of 90°. The null vector is orthogonal to any vector $x \in H$.

In the space $L_2(a, b)$ the condition for orthogonality of vectors $\varphi(x)$, $\psi(x)$ has the form

$$\int_a^b \varphi(x)\psi(x)\,dx = 0.$$

The reader will easily verify, on computing the corresponding integrals, that in the space $L_2(-\pi, \pi)$ any two vectors of the "trigonometric system"

$$1, \cos x, \sin x, \cos 2x, \sin 2x, \dots, \cos nx, \sin nx, \dots$$

are mutually orthogonal.


We note a few simple properties connected with the concept of 
orthogonality. 

(1) If a vector $x$ is orthogonal to the vectors $y_1, y_2, \dots, y_k$, it is also orthogonal to any linear combination $\alpha_1 y_1 + \dots + \alpha_k y_k$ of these vectors. For we have

$$(\alpha_1 y_1 + \dots + \alpha_k y_k, x) = \alpha_1 (y_1, x) + \dots + \alpha_k (y_k, x) = 0.$$

(2) If the vectors $y_1, y_2, \dots, y_n, \dots$ are orthogonal to a vector $x$ and $y = \lim_{n \to \infty} y_n$, then the vector $y$ is also orthogonal to $x$.

For in virtue of the continuity of the scalar product

$$(y, x) = \lim_{n \to \infty} (y_n, x) = 0.$$

It follows from properties (1) and (2) that the totality of vectors orthogonal to a vector $x$ (or to an arbitrary fixed set $X$ of vectors $x$) constitutes a closed subspace, the orthogonal complement of the vector $x$ (set $X$).

(3) Pythagoras' theorem and its generalisation. Let the vectors $x$, $y$ be orthogonal; then by analogy with elementary geometry the vector $x + y$ can be called the hypotenuse of the right-angled triangle constructed on the vectors $x$, $y$. Forming the scalar product of $x + y$ with itself and using the orthogonality of the vectors $x$, $y$, we get:

$$|x + y|^2 = (x + y, x + y) = (x, x) + 2(x, y) + (y, y) = (x, x) + (y, y) = |x|^2 + |y|^2.$$

We have thus proved Pythagoras' theorem in general Hilbert space: the square of the hypotenuse is equal to the sum of the squares of the orthogonal sides. It is not difficult to generalise this theorem to the case of any finite number of terms. Let the vectors $x_1, x_2, \dots, x_k$ be mutually orthogonal and let $z = x_1 + x_2 + \dots + x_k$; then

$$|z|^2 = (x_1 + \dots + x_k, x_1 + \dots + x_k) = |x_1|^2 + \dots + |x_k|^2.$$


2. Method for Orthogonalisation 


In order to obtain an orthogonal system of vectors, a method of orthogonalising a given non-orthogonal system is often employed. The procedure is as follows. Let there be given a sequence of vectors $x_1, x_2, \dots, x_n, \dots$ in which each finite subsystem $x_1, x_2, \dots, x_n$ is linearly independent. We claim that a special choice of the coefficients $a_{jk}$ in the formulae

$$\begin{aligned}
y_1 &= x_1, \\
y_2 &= x_2 + a_{21} x_1, \\
y_3 &= x_3 + a_{32} x_2 + a_{31} x_1, \\
&\;\;\vdots \\
y_n &= x_n + a_{n,n-1} x_{n-1} + \dots + a_{n1} x_1, \\
&\;\;\vdots
\end{aligned} \tag{1}$$

enables us to construct a system of non-zero, mutually orthogonal vectors $y_1, y_2, \dots, y_n, \dots$. The formulae (1) with the appropriate coefficients $a_{jk}$ are called the formulae of orthogonalisation.

The existence of a solution to this system, subject to the required conditions of orthogonality, is easily proved by induction. For let us assume that non-zero, mutually orthogonal vectors $y_1, \dots, y_{n-1}$ satisfying the first $n - 1$ equations of the system (1) have been constructed; we shall show that a vector $y_n$ can be found, satisfying the $n$th equation of the system (1) and orthogonal to the vectors $y_1, \dots, y_{n-1}$. We shall look for the vector $y_n$ as a linear combination of $x_1, x_2, \dots, x_n$ in the following special form:

$$y_n = b_{n1} y_1 + \dots + b_{n,n-1} y_{n-1} + x_n, \tag{2}$$

where $y_1, \dots, y_{n-1}$ are the vectors already found, and $b_{n1}, \dots, b_{n,n-1}$ are coefficients which have to be determined.

Multiplying equation (2) scalarly by $y_k$ $(k < n)$ and using the supposed orthogonality of $y_k$ to $y_1, \dots, y_{k-1}, y_{k+1}, \dots, y_{n-1}$, we get

$$(y_n, y_k) = (x_n, y_k) + b_{nk} (y_k, y_k).$$

Equating the right-hand side to zero, we get an equation in the coefficient $b_{nk}$, which is soluble since by hypothesis $(y_k, y_k) \neq 0$. When all the coefficients $b_{nk}$ $(k = 1, 2, \dots, n - 1)$ have been found in this way, equation (2) allows the construction of the vector $y_n$. By construction it will be orthogonal to each of the vectors $y_1, y_2, \dots, y_{n-1}$; it only remains for us to show that $y_n \neq 0$. For this we substitute the expressions for $y_1, \dots, y_{n-1}$ of the first $n - 1$ equations (1) in the $n$th equation; we get a linear expression for $y_n$ in $x_1, \dots, x_{n-1}, x_n$ in which the coefficient of $x_n$ is equal to 1. If $y_n$ were zero, we should have a linear dependence between $x_1, \dots, x_n$, which by supposition is impossible. It follows that $y_n \neq 0$, as required; the correctness of the method of orthogonalisation is thus substantiated.

The orthogonal system of vectors $y_1, y_2, \dots, y_n, \dots$ obtained can be further "refined" by dividing each of the vectors $y_n$ by its norm $|y_n|$; the resulting system of vectors $e_n = y_n / |y_n|$ is not only orthogonal, but also normed in such a way that each vector has a unit norm. Such systems of vectors are said to be orthonormal.
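
The inductive construction just described translates directly into a short procedure. The following is a minimal sketch for vectors of $R_n$ stored as Python lists; the function names `orthogonalise` and `normalise` are our own, and the coefficient is computed as $b_{nk} = -(x_n, y_k)/(y_k, y_k)$, which is the solution of the equation obtained from (2). The sketch assumes the input vectors are linearly independent; otherwise some $y_k$ vanishes and the division fails.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def orthogonalise(xs):
    """Return y_1, ..., y_n with y_n = x_n + b_n1 y_1 + ... + b_n,n-1 y_{n-1},
    where b_nk = -(x_n, y_k) / (y_k, y_k), as in equation (2)."""
    ys = []
    for x in xs:
        y = list(x)
        for prev in ys:
            b = -dot(x, prev) / dot(prev, prev)
            y = [yi + b * pi for yi, pi in zip(y, prev)]
        ys.append(y)
    return ys

def normalise(ys):
    """Divide each vector by its norm to obtain an orthonormal system."""
    return [[c / math.sqrt(dot(y, y)) for c in y] for y in ys]

xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ys = orthogonalise(xs)
es = normalise(ys)

for j, ej in enumerate(es):
    # Each row shows the scalar products of e_j with e_1, e_2, e_3.
    print([round(dot(ej, ek), 12) for ek in es])
```

For this input the procedure gives $y_1 = (1, 1, 0)$, $y_2 = (\tfrac12, -\tfrac12, 1)$, $y_3 = (-\tfrac23, \tfrac23, \tfrac23)$, which can be checked by hand against formulae (1).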


Problems. 1. Let a system of vectors $x_1, x_2, \dots, x_n, \dots$ be given. Prove that there exists only one (discounting numerical multiples) system of vectors $y_1, \dots, y_n, \dots$ satisfying the conditions: (a) $(x_j, y_k) = 0$ for $j < k$; (b) for any $n$ the vector $y_n$ is a linear form in $x_1, x_2, \dots, x_n$.

2. The polynomials obtained in orthogonalising the functions $1, x, x^2, \dots$ over the space $L_2(-1, 1)$ are termed Legendre polynomials. Show that the $n$th Legendre polynomial has the form

$$P_n(x) = C_n \left[(x^2 - 1)^n\right]^{(n)}.$$

Hint. Use problem 1.

3. The functions obtained in orthogonalising the expressions $e^{-x^2}, x e^{-x^2}, \dots, x^n e^{-x^2}, \dots$ over the space $L_2(-\infty, \infty)$ are termed Hermite functions. Show that the $n$th Hermite function has the form

$$\varphi_n(x) = C_n e^{x^2} \left[e^{-2x^2}\right]^{(n)}.$$

4. The functions obtained in orthogonalising the expressions $e^{-x}, x e^{-x}, x^2 e^{-x}, \dots$ over the space $L_2(0, \infty)$ are termed Laguerre functions. Show that the $n$th Laguerre function has the form

$$\psi_n(x) = C_n e^{x} \left[x^n e^{-2x}\right]^{(n)}.$$
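
The statement of problem 2 can be checked for small $n$ by exact arithmetic. The sketch below is an illustration only, not a proof: polynomials are stored as coefficient lists, the scalar product in $L_2(-1, 1)$ is evaluated from the exact integrals of the monomials, and the result of orthogonalising $1, x, x^2, \dots$ is compared with the $n$th derivative of $(x^2 - 1)^n$. Since the constants $C_n$ are not specified, both sides are rescaled to leading coefficient 1; all function names are our own.

```python
from fractions import Fraction

def inner(p, q):
    """Integral over (-1, 1) of p(x) q(x) for coefficient lists [c0, c1, ...].
    Uses the exact value of the integral of x^m: 2/(m+1) for even m, 0 for odd m."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += a * b * Fraction(2, i + j + 1)
    return total

def orthogonalise_monomials(n):
    """Orthogonalise 1, x, ..., x^n in L_2(-1, 1) by the method of art. 2."""
    ys = []
    for k in range(n + 1):
        xk = [Fraction(0)] * k + [Fraction(1)]   # the monomial x^k
        y = list(xk)
        for prev in ys:
            b = -inner(xk, prev) / inner(prev, prev)
            for i, c in enumerate(prev):
                y[i] += b * c
        ys.append(y)
    return ys

def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]

def multiply(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def rodrigues(n):
    """n-th derivative of (x^2 - 1)^n, rescaled to leading coefficient 1."""
    p = [Fraction(1)]
    for _ in range(n):
        p = multiply(p, [Fraction(-1), Fraction(0), Fraction(1)])
    for _ in range(n):
        p = derivative(p)
    return [c / p[-1] for c in p]

for n, y in enumerate(orthogonalise_monomials(4)):
    print(n, [str(c) for c in y], y == rodrigues(n))
```

For $n = 2$ and $n = 3$ both constructions give $x^2 - \tfrac13$ and $x^3 - \tfrac35 x$ respectively.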


3. The Isomorphism between Two n-dimensional Euclidean Spaces


We can now prove a theorem on the isomorphism of any two Euclidean spaces of the same dimension. For the proof of this theorem we construct in each of the spaces $R_n'$, $R_n''$ an orthonormal basis: $e_1', \dots, e_n'$ in $R_n'$ and $e_1'', \dots, e_n''$ in $R_n''$, orthogonalising by the method of art. 2 an arbitrary linearly independent system in each space. Further, we map an arbitrary vector $x' \in R_n'$ with a decomposition, say

$$x' = \sum_{j=1}^{n} \xi_j e_j',$$

relative to the basis $e_1', \dots, e_n'$, on to the vector $x'' \in R_n''$ which has the decomposition

$$x'' = \sum_{j=1}^{n} \xi_j e_j''$$

with the same coefficients $\xi_1, \dots, \xi_n$ relative to the basis $e_1'', \dots, e_n''$.

It is clear that this map is one-one and that it preserves linear operations. We shall show that the scalar products of corresponding vectors in $R_n'$, $R_n''$ coincide. We put

$$y' = \sum_{k=1}^{n} \eta_k e_k', \qquad y'' = \sum_{k=1}^{n} \eta_k e_k'';$$

then, since the bases $\{e_j'\}$, $\{e_j''\}$ are orthonormal, we have

$$(x', y') = \left(\sum_{j=1}^{n} \xi_j e_j', \sum_{k=1}^{n} \eta_k e_k'\right) = \sum_{j=1}^{n} \xi_j \eta_j,$$

$$(x'', y'') = \left(\sum_{j=1}^{n} \xi_j e_j'', \sum_{k=1}^{n} \eta_k e_k''\right) = \sum_{j=1}^{n} \xi_j \eta_j = (x', y'),$$

as asserted. This completes the proof.


4. Orthonormal Systems in an Infinite-dimensional Hilbert Space H 


In an infinite-dimensional space, infinite orthonormal systems are known to exist. We shall call an orthogonal system $e_1, e_2, \dots, e_n, \dots$ complete in the space $H$ if there does not exist a non-zero vector in $H$ orthogonal to each vector of the system. In other words, the system $e_1, e_2, \dots, e_n, \dots$ is complete if the conditions

$$x \in H, \quad (e_n, x) = 0 \quad (n = 1, 2, \dots)$$

imply that $x = 0$.

We shall show that a complete orthonormal system in a complete Hilbert space is a basis, in the sense that for each vector $f \in H$ there exists a decomposition into a convergent (in the norm) series

$$f = \sum_{j=1}^{\infty} c_j e_j \tag{1}$$

with

$$|f|^2 = \sum_{j=1}^{\infty} c_j^2. \tag{2}$$
jel 


For the proof we first find expressions for the coefficients in the resolution (1), assuming that it exists. To this end we multiply both sides of (1) scalarly by the vector $e_k$. Since the scalar product is continuous, we get

$$(f, e_k) = \left(\sum_{j=1}^{\infty} c_j e_j, e_k\right) = \left(\lim_{n \to \infty} \sum_{j=1}^{n} c_j e_j, e_k\right) = \lim_{n \to \infty} \left(\sum_{j=1}^{n} c_j e_j, e_k\right) = \lim_{n \to \infty} c_k = c_k.$$

We have the formula

$$c_k = (f, e_k). \tag{3}$$

The coefficients $c_k$ defined by (3) are termed the Fourier coefficients of the vector $f$ with respect to the system $\{e_k\}$. We observe that these numbers can be constructed directly from the vector $f$ and the system $\{e_k\}$, although we do not yet know whether or not the decomposition (1) exists. They have a simple geometric significance: since

$$(f, e_k) = |f|\,|e_k| \cos(\widehat{f, e_k}) = |f| \cos(\widehat{f, e_k}),$$

the coefficient $c_k$ is the projection of the vector $f$ on the direction of the vector $e_k$.

Let $f$ be a prescribed vector and let $e_1, \dots, e_n$ be a fixed finite orthonormal system (in general incomplete). We construct the vector

$$g = \sum_{k=1}^{n} (f, e_k)\, e_k.$$

This vector belongs to the subspace $H_n$ generated by the vectors $e_1, \dots, e_n$. We also define a vector $h$ by the condition

$$f = g + h.$$

We claim that the vector $h$ is orthogonal to each of the vectors $e_1, e_2, \dots, e_n$ (and consequently to the whole subspace $H_n$). For we have

$$(h, e_j) = (f - g, e_j) = (f, e_j) - \left(\sum_{k=1}^{n} (f, e_k) e_k, e_j\right) = (f, e_j) - (f, e_j) = 0 \quad (j = 1, 2, \dots, n).$$

In geometrical language $h$ is the perpendicular dropped from the end of the vector $f$ onto the subspace $H_n$, and the vector $g$ is the projection of $f$ on this subspace. Hence by Pythagoras' theorem

$$|f|^2 = |g|^2 + |h|^2 = \sum_{k=1}^{n} (f, e_k)^2 + |h|^2 \geq \sum_{k=1}^{n} (f, e_k)^2.$$

The inequality

$$\sum_{k=1}^{n} (f, e_k)^2 \leq |f|^2, \tag{4}$$

which holds for any vector $f$ and any orthonormal system $e_1, \dots, e_n$, is known as Bessel's inequality. If we are given an infinite orthonormal system $e_1, e_2, \dots, e_n, \dots$, then since Bessel's inequality holds for any $n$, we find by passing to the limit that

$$\sum_{k=1}^{\infty} (f, e_k)^2 \leq |f|^2. \tag{5}$$

In other words, the squares of the Fourier coefficients of any vector $f$ with respect to an orthonormal system $e_1, \dots, e_n, \dots$ always form a convergent series.

We shall also refer to inequality (5) as Bessel's inequality.
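
The construction of $g$ and $h$ and Bessel's inequality (4) can be followed on a small numerical example. The sketch below takes an orthonormal but incomplete system of two vectors in $R_4$; the particular vectors and variable names are chosen only for illustration.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# An orthonormal but incomplete system in R_4 (two vectors only).
e1 = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0, 0.0]
e2 = [0.0, 0.0, 1.0, 0.0]
system = [e1, e2]

f = [3.0, 1.0, -2.0, 5.0]

coeffs = [dot(f, e) for e in system]          # c_k = (f, e_k), formula (3)
g = [sum(c * e[i] for c, e in zip(coeffs, system)) for i in range(len(f))]
h = [fi - gi for fi, gi in zip(f, g)]         # f = g + h

bessel_sum = sum(c * c for c in coeffs)       # left-hand side of (4)

print([round(dot(h, e), 12) for e in system]) # h is orthogonal to e_1, e_2
print(bessel_sum, dot(f, f))                  # sum of squares does not exceed |f|^2
print(bessel_sum + dot(h, h))                 # Pythagoras: equals |f|^2 up to rounding
```

Here $g = (2, 2, -2, 0)$ and $h = (1, -1, 0, 5)$, so that $|g|^2 + |h|^2 = 12 + 27 = 39 = |f|^2$; the inequality (4) is strict because the system is incomplete and $h \neq 0$.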

We now proceed to formulate and prove a fundamental theorem. 

THEOREM 1. Let a complete orthonormal system $e_1, e_2, \dots, e_n, \dots$ be chosen in a complete Hilbert space $H$. Then for any vector $f$ there exists a decomposition

$$f = \sum_{k=1}^{\infty} (f, e_k)\, e_k, \tag{6}$$

where

$$|f|^2 = \sum_{k=1}^{\infty} (f, e_k)^2. \tag{7}$$

The last equality, which is an infinite-dimensional generalisation of Pythagoras' theorem, is generally referred to as Parseval's formula.

Proof. For the sake of brevity we put $(f, e_n) = a_n$.

Let $s_p$ be the sum of $p$ terms of the series (6) and let $q > p$. Then

$$|s_q - s_p|^2 = \left|\sum_{j=p+1}^{q} a_j e_j\right|^2 = \sum_{j=p+1}^{q} a_j^2.$$

As $p \to \infty$, this quantity tends to zero in consequence of the convergence of the series of numbers $a_j^2$. The sums $s_p$ therefore constitute a fundamental sequence. Since the space $H$ is supposed to be complete, the sums $s_p$ have some limit $s \in H$ as $p \to \infty$. We shall show that $s = f$. To do this, we observe that for fixed $k$ and $p > k$,

$$(s, e_k) = \lim_{p \to \infty} (s_p, e_k) = \lim_{p \to \infty} \left(\sum_{j=1}^{p} a_j e_j, e_k\right) = \lim_{p \to \infty} a_k = a_k = (f, e_k);$$

it follows that for any $k$

$$(f - s, e_k) = (f, e_k) - (s, e_k) = 0. \tag{8}$$

Since the system $\{e_k\}$ is supposed to be complete, equation (8) implies that $f = s$. Thus

$$f = \lim_{p \to \infty} s_p = \sum_{j=1}^{\infty} a_j e_j.$$

Further, in virtue of the continuity of the scalar product,

$$|f|^2 = (f, f) = \lim_{p \to \infty} (s_p, s_p) = \lim_{p \to \infty} \sum_{j=1}^{p} a_j^2 = \sum_{j=1}^{\infty} a_j^2,$$

as required.


Note 1. If $g$ is any other vector of the space, then by a similar evaluation of the scalar product $(f, g)$ we obtain the formula

$$(f, g) = \sum_{k=1}^{\infty} (f, e_k)(g, e_k). \tag{9}$$

Note 2. If $a_1, \dots, a_n, \dots$ is any sequence of numbers for which $\sum_{k=1}^{\infty} a_k^2$ converges, then the series

$$\sum_{k=1}^{\infty} a_k e_k$$

converges in $H$, as is evident from the beginning of our proof. If we denote this sum by $f$, then as above we shall have $a_n = (f, e_n)$ $(n = 1, 2, \dots)$. Hence any numerical sequence with a convergent series of squares is the sequence of Fourier coefficients of some vector of the space $H$.
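
In a finite-dimensional space the series in (6), (7) and (9) reduce to finite sums, and the three formulae can be verified directly. The following sketch does this in $R_3$ for a complete orthonormal system obtained by rotating the standard basis; the rotation angle and the vectors `f`, `g` are arbitrary illustrative choices.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# A complete orthonormal system in R_3: the standard basis rotated by an
# arbitrary illustrative angle about the third axis.
c, s = math.cos(0.3), math.sin(0.3)
basis = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

f = [2.0, -1.0, 0.5]
g = [1.0, 4.0, -3.0]

a = [dot(f, e) for e in basis]   # Fourier coefficients of f
b = [dot(g, e) for e in basis]   # Fourier coefficients of g

# Decomposition (6): rebuild f from its Fourier coefficients.
f_rebuilt = [sum(ak * e[i] for ak, e in zip(a, basis)) for i in range(3)]

# Parseval's formula (7) and formula (9) for the scalar product.
parseval_lhs, parseval_rhs = dot(f, f), sum(ak * ak for ak in a)
product_lhs, product_rhs = dot(f, g), sum(ak * bk for ak, bk in zip(a, b))

print(f_rebuilt)
print(parseval_lhs, parseval_rhs)
print(product_lhs, product_rhs)
```

Each printed pair agrees up to rounding error, and `f_rebuilt` reproduces `f`.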


5. Criterion for the Completeness of a System 


To apply the fundamental theorem of art. 4 we need a complete orthonormal system $e_1, e_2, \dots, e_n, \dots$. If it is not immediately apparent whether a given orthonormal system $e_1, e_2, \dots, e_n, \dots$ is complete in the space $H$, the following completeness criterion can be used:

THEOREM 2. A given orthonormal system $e_1, \dots, e_n, \dots$ is complete in the complete space $H$ if and only if the linear combinations of vectors of the system form a set everywhere dense in $H$.

Proof. If the system $e_1, e_2, \dots, e_n, \dots$ is complete, then in virtue of theorem 1 each vector $f \in H$ is the limit of linear combinations of vectors of the system $\{e_k\}$, so that the totality of linear combinations of vectors of the system forms a set everywhere dense in the space $H$. Conversely, let it be known that the linear combinations of vectors of the system $\{e_k\}$ form an everywhere dense set in the space $H$ and let the equations $(g, e_k) = 0$ $(k = 1, 2, \dots)$ hold for some vector $g$. The orthogonal complement of the vector $g$ contains all the vectors $e_k$, all their linear combinations and the closure of the set of their linear combinations, i.e. the whole space $H$. In particular $(g, g) = 0$ and consequently $g = 0$. Thus the system $\{e_k\}$ is complete, as required.

Let us consider the case when the system $\{e_n\}$ is obtained by orthogonalising some system $\{x_n\}$. In accordance with the formulae for orthogonalisation each vector $e_n$ is a linear combination of the vectors $x_1, \dots, x_n$ and conversely each vector $x_n$ is a linear combination of the vectors $e_1, \dots, e_n$ [cf. equation (2) of art. 2]. Hence the totality of linear combinations of the vectors $\{e_n\}$ coincides with that of the vectors $\{x_n\}$. The completeness of the system $e_1, \dots, e_n, \dots$ can therefore be established by proving that the totality of linear combinations of the original vectors $\{x_n\}$ is dense in the space $H$.

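Example. As we have seen, the vectors $1, \cos x, \sin x, \dots$ form an orthogonal system in the space $L_2(-\pi, \pi)$. The linear combinations of vectors of this system (the trigonometric polynomials) form an everywhere dense set in $L_2(-\pi, \pi)$ (Chapter IV, Section 5, art. 3). By theorem 2, the system $1, \cos x, \sin x, \dots$ is complete in the space $L_2(-\pi, \pi)$ and theorem 1 holds: every function $\varphi(x) \in L_2(-\pi, \pi)$ resolves into a series in the functions $1, \cos x, \sin x, \dots$ convergent in the metric of the space $L_2(-\pi, \pi)$. It must be remarked that the functions $1, \cos x, \sin x, \dots$ are not normalised: it is easily calculated that $\|1\|^2 = 2\pi$, $\|\cos mx\|^2 = \|\sin mx\|^2 = \pi$. The normalised system is composed of the functions

$$\frac{1}{\sqrt{2\pi}}, \quad \frac{\cos x}{\sqrt{\pi}}, \quad \frac{\sin x}{\sqrt{\pi}}, \quad \frac{\cos 2x}{\sqrt{\pi}}, \quad \frac{\sin 2x}{\sqrt{\pi}}, \quad \dots$$

By formulae (1) and (3) of art. 4 the decomposition we are seeking has the form

$$\begin{aligned}
\varphi(x) &= \left(\varphi, \frac{1}{\sqrt{2\pi}}\right) \frac{1}{\sqrt{2\pi}} + \left(\varphi, \frac{\cos x}{\sqrt{\pi}}\right) \frac{\cos x}{\sqrt{\pi}} + \left(\varphi, \frac{\sin x}{\sqrt{\pi}}\right) \frac{\sin x}{\sqrt{\pi}} + \dots \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(x)\,dx + \frac{\cos x}{\pi} \int_{-\pi}^{\pi} \varphi(x) \cos x\,dx + \frac{\sin x}{\pi} \int_{-\pi}^{\pi} \varphi(x) \sin x\,dx + \dots
\end{aligned}$$

With the notation

$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} \varphi(x)\,dx, \quad a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} \varphi(x) \cos nx\,dx, \quad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} \varphi(x) \sin nx\,dx,$$

we arrive at the usual Fourier series expansion:

$$\varphi(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).$$

We now know that this series converges in the metric of $L_2(-\pi, \pi)$ for any function $\varphi(x) \in L_2(-\pi, \pi)$.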
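
By way of illustration, the coefficients $a_0$, $a_n$, $b_n$ can be approximated numerically for a concrete function. The sketch below takes $\varphi(x) = x$ on $(-\pi, \pi)$, for which the exact values are $a_n = 0$ and $b_n = 2(-1)^{n+1}/n$, and replaces the integrals by a simple midpoint rule; the function names, the number of steps and the number of harmonics are illustrative assumptions. In terms of $a_n$, $b_n$ Parseval's formula (7), applied to the normalised system, reads $\|\varphi\|^2 = \pi\left(a_0^2/2 + \sum (a_n^2 + b_n^2)\right)$, so that any partial sum of the right-hand side cannot exceed $\|\varphi\|^2$ (Bessel's inequality).

```python
import math

def integrate(func, a, b, steps=4000):
    """Midpoint rule; a numerical stand-in for the integral over (a, b)."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))

def fourier_coefficients(phi, n_max):
    """Approximate a_0 and the lists a_n, b_n (n = 1..n_max) in the notation of the text."""
    a0 = integrate(phi, -math.pi, math.pi) / math.pi
    a = [integrate(lambda x, n=n: phi(x) * math.cos(n * x), -math.pi, math.pi) / math.pi
         for n in range(1, n_max + 1)]
    b = [integrate(lambda x, n=n: phi(x) * math.sin(n * x), -math.pi, math.pi) / math.pi
         for n in range(1, n_max + 1)]
    return a0, a, b

def phi(x):
    return x

a0, a, b = fourier_coefficients(phi, 20)

norm_sq = integrate(lambda x: phi(x) ** 2, -math.pi, math.pi)   # |phi|^2
partial = math.pi * (a0 ** 2 / 2 + sum(an ** 2 + bn ** 2 for an, bn in zip(a, b)))

print(b[:4])              # close to 2, -1, 2/3, -1/2
print(partial, norm_sq)   # the partial sum stays below |phi|^2 = 2 pi^3 / 3
```

Increasing the number of harmonics brings the partial sum closer to $\|\varphi\|^2$, in accordance with Parseval's formula.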


Problems. 1. Prove that the system of Legendre polynomials (art. 2, problem 2) is complete in the space $L_2(-1, 1)$.

Hint. Use the Weierstrass theorem on the approximation of continuous functions by polynomials.

Note. The system of Hermite functions (art. 2, problem 3) in the space $L_2(-\infty, \infty)$ and the system of Laguerre functions (art. 2, problem 4) in the space $L_2(0, \infty)$ are also complete. Cf. Chapter VII, Section 3, art. 5.

2. Construct an orthogonal system of continuous functions $e_1(x), \dots, e_n(x), \dots$ in the space $L_2(a, b)$ which has the properties:

(a) $\int_a^b f(x) e_n(x)\,dx = 0$ $(n = 1, 2, \dots)$ implies that $f(x) \equiv 0$, for any continuous function $f(x)$;

(b) the linear combinations of the functions $e_1(x), \dots, e_n(x), \dots$ are non-dense in the space $L_2(a, b)$.

Hint. Orthogonalise a sequence of polynomials with rational coefficients which are orthogonal to a fixed discontinuous function, say the characteristic function of the open interval $(a, c)$, $a < c < b$.

Note. The result of this problem shows that, for the completeness criterion given here to hold, it is essential to presuppose that the space $H$ is complete.


6. The Isomorphism of all Countable-dimensional Hilbert Spaces 


We shall say that a Hilbert space is of countable dimension if there exists in it a complete countable orthonormal system. We shall show in this article that any two complete spaces of countable dimension are isomorphic.

By theorem 1 of art. 4 each vector $f$ of a space $H$ of countable dimension with a complete orthonormal system $e_1, e_2, \dots, e_n, \dots$ admits a decomposition

$$f = \sum_{k=1}^{\infty} c_k e_k,$$

where $c_k = (f, e_k)$ and the series of squares of the numbers $c_k$ converges. On the other hand, as we observed in art. 4, if $c_1, c_2, \dots, c_n, \dots$ is an arbitrary sequence with a convergent series of squares, then there exists a vector $f \in H$ whose development in the system $\{e_k\}$ has the numbers $\{c_k\}$ as coefficients.

Using this, we can establish a one-one correspondence between an arbitrary complete space $H$ of countable dimension and the space $l_2$ (Section 1, art. 1, ex. 2), mapping a vector $f \in H$ on to the sequence $c_n = (f, e_n)$ of its Fourier coefficients with respect to a complete orthonormal system $e_1, e_2, \dots, e_n, \dots$. Such a map evidently preserves linear operations. It also preserves scalar products since, if

$$f = \sum_{k=1}^{\infty} a_k e_k, \qquad g = \sum_{k=1}^{\infty} b_k e_k,$$

then by equation (9) of art. 4

$$(f, g) = \sum_{k=1}^{\infty} a_k b_k,$$

and it was by just such a formula that we defined the scalar product of elements $x = (a_k)$, $y = (b_k)$ of the space $l_2$. In short, we reaffirm that the space $l_2$ is complete. We see also that $L_2(a, b)$, as a complete space of countable dimension, is isomorphic to the space $l_2$. Further, any two complete spaces of countable dimension are isomorphic to the space $l_2$ and are consequently isomorphic to one another.


7. Separability 


Finite-dimensional spaces and spaces of countable dimension can be defined as separable spaces, i.e. spaces which possess a countable everywhere dense subset.

For if a space $H$ is finite- or countable-dimensional, it must contain a complete finite or countable orthonormal system $e_1, e_2, \dots$. By theorem 1 of art. 4 the linear combinations of the vectors $e_1, e_2, \dots$ form an everywhere dense set in $H$. If we confine ourselves to linear combinations with rational coefficients, we get a countable set of elements, dense as before in $H$; thus $H$ is separable.

Conversely, let $H$ be separable and let $f_1, f_2, \dots, f_n, \dots$ be a countable everywhere dense set in $H$. If we orthogonalise the elements $f_1, f_2, \dots$, then by art. 5 we get a complete system, orthogonal in $H$, which will be at most countable by construction; thus $H$ is of finite or countable dimension.

We observe that in a separable space $H$ each subspace $H' \subset H$ is also separable. For the proof we fix $n$ and $k$ and select an element $\varphi_{nk} \in H'$, if one exists, which lies in a sphere of radius $1/k$ with centre at a point $f_n$ of a countable set $f_1, f_2, \dots$ everywhere dense in $H$. We claim that the resulting countable set of elements $\varphi_{nk}$ $(k, n = 1, 2, \dots)$ is dense in $H'$. For corresponding to any $\varphi \in H'$ and any $k$ we can find some element $f_n$ contained in the sphere of radius $1/k$, centre $\varphi$; we are then assured that the sphere of radius $1/k$, centre $f_n$, contains elements of the set $H'$. Hence there exists an element $\varphi_{nk} \in H'$ for which

$$|\varphi - \varphi_{nk}| \leq |\varphi - f_n| + |f_n - \varphi_{nk}| < 2/k.$$

In particular, any closed subspace $H'$ of a separable Hilbert space $H$ contains a complete orthogonal system $e_1, e_2, \dots$


Problem. Give a direct proof of the separability of the spaces $L_2(-\infty, \infty)$, $L_2(0, \infty)$.




8. Orthogonal Complements 


We return to the orthogonalisation theorem of art. 2. In this theorem we started from a given system of vectors $x_1, x_2, \dots, x_n, \dots$ and constructed a new system $y_1, y_2, \dots, y_n, \dots$ which was such that the vector $y_n$ was a linear combination of the vectors $x_1, \dots, x_n$ and was orthogonal to the vectors $x_1, \dots, x_{n-1}$. The equation

$$x_n = y_n - a_{n1} x_1 - a_{n2} x_2 - \dots - a_{n,n-1} x_{n-1} = y_n + g_n$$

determines a decomposition of the vector $x_n$ into the sum of two vectors $g_n$, $y_n$, the first of which lies in the subspace $H_{n-1}$ generated by the vectors $x_1, \dots, x_{n-1}$, while the second is orthogonal to $x_1, \dots, x_{n-1}$, and consequently orthogonal to each vector of the subspace $H_{n-1}$. It is therefore natural to call the vector $g_n$ the projection of the vector $x_n$ on the subspace $H_{n-1}$, and $y_n$ the perpendicular dropped from the end-point of $x_n$ on to the subspace $H_{n-1}$. The process of orthogonalisation consists precisely in replacing the vector $x_n$ by the perpendicular dropped from its end-point onto the subspace generated by the preceding vectors. As we saw in art. 2, the existence of this vector can be demonstrated by analysing the equations of orthogonalisation. The corresponding result essentially uses the finite dimensionality of the subspace $H_{n-1}$.

Now let $L$ be an arbitrary subspace of a Hilbert space $H$ and let $f$ be a vector not contained in $L$. We put the question: can we in this case guarantee the existence of a decomposition

$$f = g + h,$$

where $g \in L$ and $h$ is orthogonal to each vector in $L$ (we shall say briefly: orthogonal to $L$)? It turns out that such a decomposition exists under certain conditions on $H$ and $L$.

THEOREM. If $H$ is a complete Hilbert space and $L \subset H$ is a closed subspace, then for any $f \in H$ there exists a decomposition

$$f = g + h, \tag{1}$$

where $g \in L$ and $h$ is orthogonal to $L$; moreover $g$ and $h$ are determined uniquely by the vector $f$.

Proof. We denote $d = \inf_{g \in L} |f - g|$. There are two possibilities: either $d = 0$ or $d > 0$. If $d = 0$, we can find a sequence $g_n \in L$ such that $|f - g_n| \to 0$ and therefore $f$ is a limit point for the subspace $L$; but since $L$ is closed, the point $f$ itself must belong to $L$. The decomposition (1) is evidently realised for $g = f$, $h = 0$.

Now let $d > 0$. We consider a sequence $g_n \in L$ for which $|f - g_n| \to d$. Applying the parallelogram lemma (Section 1, art. 3) to the vectors $x = f - g_n$, $y = f - g_m$, we get

$$2|f - g_n|^2 + 2|f - g_m|^2 = 4\left|f - \frac{g_n + g_m}{2}\right|^2 + |g_n - g_m|^2.$$

As $n, m \to \infty$ the left-hand side tends to $4d^2$. The first term of the right-hand side is not less than $4d^2$ since $(g_n + g_m)/2 \in L$, $|f - (g_n + g_m)/2| \geq d$. Hence the final term on the right-hand side tends to zero and it follows that the sequence $g_n$ is fundamental. Since the space $H$ is complete, the sequence $g_n$ has a limit $g$ as $n \to \infty$, which belongs to the subspace $L$ in virtue of its being closed.

We shall show that the vector $h = f - g$ is orthogonal to $L$. For any $q \in L$ we have for any $\lambda$

$$d^2 \leq |f - (g - \lambda q)|^2 = |h + \lambda q|^2 = (h + \lambda q, h + \lambda q) = d^2 + 2\lambda (h, q) + \lambda^2 |q|^2,$$

and consequently

$$2\lambda (h, q) + \lambda^2 |q|^2 \geq 0,$$

but this can only be the case for arbitrary $\lambda$ if $(h, q) = 0$.

Thus the decomposition (1) is established. We shall show now 
that the components g, h are uniquely determined. Let us suppose 
that 

f=gth=g +h, 
where g, g' belong to Z and h, h' are orthogonal to L. Subtracting, 
we find: 
O=G-g)+(h—hk’), 


where g — g’ € L and h — h’ is orthogonal to LZ. By Pythagoras’ 
theorem, g — g' =h —h' = 0,80 thatg = g’,h = h’, as required. 

The totality of vectors h orthogonal to the subspace L (including 
the null vector) constitutes a closed subspace M, which is said to 
be the orthogonal complement of the subspace L. 

We have proved that any closed subspace $L \subset H$ has an orthogonal
complement $M$ and any vector $f \in H$ determines a decomposition

$$f = g + h, \qquad g \in L, \quad h \in M.$$
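As an illustration only, the decomposition can be computed explicitly in a finite-dimensional space, where every subspace is closed. The following Python sketch is not taken from the text: it assumes the space is R^3 with the usual scalar product, and the helper names `orthonormalise` and `decompose` are hypothetical. It builds g by Gram-Schmidt projection and sets h = f - g.

```python
# Sketch: decompose f = g + h with g in L = span(spanning_vectors) and h
# orthogonal to L. A finite-dimensional stand-in for the theorem; the names
# below are illustrative assumptions, not notation from the text.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalise(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis of span(vectors)."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = dot(w, w) ** 0.5
        if length > tol:  # skip vectors already in the span
            basis.append([wi / length for wi in w])
    return basis

def decompose(f, spanning_vectors):
    """Return (g, h): g is the projection of f onto L, and h = f - g."""
    g = [0.0] * len(f)
    for e in orthonormalise(spanning_vectors):
        c = dot(f, e)
        g = [gi + c * ei for gi, ei in zip(g, e)]
    h = [fi - gi for fi, gi in zip(f, g)]
    return g, h

L_vectors = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
f = [1.0, 2.0, 4.0]
g, h = decompose(f, L_vectors)
```

Here g lies in the span of `L_vectors`, and h has vanishing scalar product with each spanning vector, up to rounding.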




Problem. A system of elements $f_1, f_2, \dots, f_n, \dots$ of a Hilbert
space $H$ is said to be minimal if, for every $k$, the vector $f_k$ is not
contained in the closed subspace generated by the remaining vectors;
it is said to be complete if the closed subspace generated by all the
vectors $f_1, f_2, \dots, f_n, \dots$ coincides with the whole space $H$.

Systems $f_1, f_2, \dots, f_n, \dots$ and $e_1, e_2, \dots, e_n, \dots$ are said to be
quadratically proximate if

$$\sum_{k=1}^{\infty} |f_k - e_k|^2 < \infty.$$

Show that a minimal system $f_1, f_2, \dots, f_n, \dots$ which is quadratically
proximate to a complete orthonormal system $e_1, e_2, \dots, e_n, \dots$
is complete (N. K. Bari).

Hint. We denote by $L_p^q(g)$ the closed subspace generated by the
vectors $g_p, \dots, g_q$. Let $\sum_{k=N+1}^{\infty} |f_k - e_k|^2 < 1$, and show that
$L_{N+1}^{\infty}(e)$ does not contain a single non-zero vector orthogonal to the
whole of $L_{N+1}^{\infty}(f)$. Then deduce that any vector $x \in H$ can be
represented in the form $y + z$, where $y \in L_1^N(e)$, $z \in L_{N+1}^{\infty}(f)$.

The factor space $H/L_{N+1}^{\infty}(f)$ (cf. Chapter II, Section 8, art. 4)
will then have dimension $N$ at most. At the same time it contains
the $N$ linearly independent vector images of $f_1, \dots, f_N$; they
therefore constitute a basis in $H/L_{N+1}^{\infty}(f)$.

The vector $z$, which is orthogonal to all the $f_k$ ($k = 1, 2, \dots$), has
as its image in $H/L_{N+1}^{\infty}(f)$ the class $Z$ which is orthogonal to the
images of the $f_k$ ($k = 1, 2, \dots, N$); hence $Z = 0$, $z \in L_{N+1}^{\infty}(f)$;
consequently $z = 0$.


9. The General Form of a Linear Functional in Hilbert Space 


We shall now apply the theorem on orthogonal complements to 
deduce the general form of a bounded linear functional on a com- 
plete Hilbert space. 

Let $x_0$ be a fixed vector; for any $x$ we put

$$f(x) = (x, x_0). \qquad (1)$$

The functional $f(x)$ is evidently a linear functional on $H$. It is
bounded on the unit sphere in virtue of the Cauchy-Bunyakovsky
inequality

$$|f(x)| = |(x, x_0)| \le |x|\,|x_0|.$$




We shall show that formula (1) gives the general form of a bounded 
linear functional on the space H. We observe that a bounded linear 
functional is always continuous (Chapter II, Section 9). 

Let $f(x)$ be a bounded linear functional on the complete Hilbert
space $H$ which does not vanish identically. We consider the subspace
$H' \subset H$ defined by the equation $f(x) = 0$. Using the continuity
of the functional $f(x)$, it is easy to verify that $H'$ is closed.
Let $H''$ be the orthogonal complement of the subspace $H'$. We
shall show that $H''$ is one-dimensional. Let $z_1, z_2 \in H''$ be non-zero
vectors; then $y = f(z_1) z_2 - f(z_2) z_1$ also belongs to $H''$; but

$$f(y) = f(z_1) f(z_2) - f(z_2) f(z_1) = 0,$$

hence $y \in H'$. But $y \in H'$, $y \in H''$ implies that $(y, y) = 0$ and
therefore that $y = 0$. Since $f(z_1) \ne 0$, $f(z_2) \ne 0$, the vectors $z_1$, $z_2$
must be linearly dependent. It follows that $H''$ is one-dimensional.

Now let $e \in H''$ be a normalised vector. Every vector $z \in H''$
is then of the form $\lambda e$; but as we saw in art. 8 every vector $x \in H$
resolves into the sum $x = z + y$ ($z \in H''$, $y \in H'$). Since $z = \lambda e$,
we have

$$x = \lambda e + y = (x, e)\,e + y.$$

It follows that

$$f(x) = (x, e) f(e) + f(y) = (x, e) f(e) = (x, f(e)\,e) = (x, x_0),$$

where $x_0 = f(e)\,e$ is a fixed vector of the space $H$.

Thus any bounded linear functional on the complete Hilbert space $H$
represents the scalar product of the vector $x$ with a fixed vector $x_0$. The
vector $x_0$ is determined uniquely, moreover, since the identity
$(x, x_0) = (x, x_1)$ would imply that $(x, x_0 - x_1) = 0$ for any $x$; but
then we should have $(x_0 - x_1, x_0 - x_1) = 0$, followed by $x_0 = x_1$.
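The representation $f(x) = (x, x_0)$ can be checked directly in a finite-dimensional space. The Python sketch below is an illustration under stated assumptions, not part of the text: the space is R^3, the functional `f` is a hypothetical example, and `representing_vector` is an illustrative helper. It reads $x_0$ off from the values of f on the coordinate basis, and then confirms the formula $x_0 = f(e)\,e$ for the unit vector e orthogonal to the null subspace of f.

```python
# Sketch: in R^n a linear functional f has the form f(x) = (x, x0).
# The functional and helper names are illustrative assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def representing_vector(functional, n):
    """Return x0 such that functional(x) equals dot(x, x0) on R^n."""
    x0 = []
    for i in range(n):
        e_i = [1.0 if j == i else 0.0 for j in range(n)]
        x0.append(functional(e_i))
    return x0

def f(x):
    # A hypothetical linear functional on R^3.
    return 2.0 * x[0] - x[1] + 0.5 * x[2]

x0 = representing_vector(f, 3)
x = [3.0, 4.0, -2.0]

# The unit vector e spanning the orthogonal complement of {x : f(x) = 0}
# is x0 / |x0|; the construction in the text then gives x0 = f(e) e.
length = dot(x0, x0) ** 0.5
e = [c / length for c in x0]
recovered = [f(e) * c for c in e]
```

The Cauchy-Bunyakovsky bound $|f(x)| \le |x|\,|x_0|$ can be verified on the same data.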


3. LINEAR OPERATORS 
1. Definition and Examples 


Let $R$ be a linear space. An operator $A$, defined on the space $R$,
is a function which maps each element $x \in R$ onto an element
$y = Ax$ of the same space.

The operator $A$ is said to be linear if the following conditions
are satisfied:

(I) $A(x + y) = Ax + Ay$ for any $x, y \in R$;
(II) $A(\alpha x) = \alpha Ax$ for any $x \in R$ and any scalar $\alpha$.




From formulae (I) and (II) we easily obtain the more general
formula

$$A(\alpha_1 x_1 + \dots + \alpha_k x_k) = \alpha_1 A x_1 + \dots + \alpha_k A x_k \qquad (1)$$

for any $x_1, \dots, x_k \in R$ and any real numbers $\alpha_1, \dots, \alpha_k$.

Examples 1. The operator which maps each vector of a space
onto the null vector is obviously linear. It is called the null operator.

2. The operator $E$ which maps each vector $x$ onto itself is evidently
linear; it is called the unit or identity operator.

3. A linear operator $A$ which maps each vector $x$ onto $\lambda x$
(where $\lambda$ is a fixed number) is called a similarity operator.

4. Let $H$ be a Hilbert space of countable dimension and let
$e_1, e_2, \dots, e_n, \dots$ be a complete orthonormal system in $H$. We fix
a bounded sequence of real numbers $\lambda_1, \lambda_2, \dots, \lambda_n, \dots$, $|\lambda_n| \le C$,
and for any vector

$$x = \sum_{j=1}^{\infty} \xi_j e_j$$

we define

$$Ax = \sum_{j=1}^{\infty} \lambda_j \xi_j e_j. \qquad (2)$$

Since $\sum \lambda_j^2 \xi_j^2 \le C^2 \sum \xi_j^2 < \infty$, the operator $A$ is defined by
formula (2) over the whole space $H$. It is easily verified that this
operator satisfies conditions (I), (II). Such an operator will be
called an operator of the normal form. Each basis vector $e_n$ is mapped
by the operator $A$ onto its magnification by the coefficient $\lambda_n$:

$$A e_n = \lambda_n e_n.$$

5. We fix a bounded measurable function $\alpha(x)$ on the closed interval
$[a, b]$. A linear operator can be defined on the space $L_2(a, b)$ as
multiplication by $\alpha(x)$:

$$A\varphi(x) = \alpha(x)\,\varphi(x).$$
6. Fredholm's integral operator. In the region

$$G = \{a \le s \le b,\ a \le x \le b\}$$

we fix a function $K(x, s)$, the square of which is integrable over the
region:

$$\int_a^b \int_a^b K^2(x, s)\,dx\,ds = K^2 < \infty.$$

We define an operator $A$ on the space $L_2(a, b)$ by the formula

$$\psi(x) = A\varphi(x) = \int_a^b K(x, s)\,\varphi(s)\,ds. \qquad (3)$$

We shall show that formula (3) does indeed define an operator on
$L_2(a, b)$. Since by Fubini's theorem (Chapter IV, Section 5, art. 2)
the function $K^2(x, s)$ is summable on the region $G$, it is a summable
function of $s$ for almost all $x$, which means that $K(x, s)$, as a function
of $s$, belongs to $L_2(a, b)$ for almost all $x$. The integral (3), which
represents a scalar product of the functions $K(x, s)$, $\varphi(s)$, therefore
exists for almost all $x$ and any function $\varphi(x) \in L_2(a, b)$. By the same
theorem of Fubini, the function

$$k^2(x) = \int_a^b K^2(x, s)\,ds$$

is summable with respect to $x$ and

$$\int_a^b k^2(x)\,dx = \int_a^b \int_a^b K^2(x, s)\,dx\,ds = K^2,$$

so that $k(x) \in L_2(a, b)$. Obtaining a bound now for the scalar
product (3) with the help of the Cauchy-Bunyakovsky inequality,
we find:

$$|\psi(x)|^2 \le \int_a^b K^2(x, s)\,ds \int_a^b \varphi^2(s)\,ds = k^2(x)\,|\varphi|^2, \qquad (4)$$

so that the function $\psi(x)$ belongs to the space $L_2(a, b)$ as required.
It is clear that Fredholm's operator (3) is a linear operator (i.e.
conditions (I), (II) are satisfied).


Problems. 1. Show that the boundedness condition on the numbers $\lambda_n$
cannot be weakened in defining an operator $A$ of the normal form (example 4)
if it is desired that the definition should extend over the whole space. In
other words, if for some sequence $\lambda_n$ an operator $A$ is defined by formula (2)
for any vector $x \in H$, then the numbers $|\lambda_n|$ have a common bound.

2. The boundedness condition on the function $\alpha(x)$ cannot be weakened
in defining the operator $A$ in example 5 if the definition is required to extend
over the whole space.


2. Operations with Linear Operators 


Various operations can be carried out on linear operators de- 
fined on a linear space R, resulting in the creation of new linear 
operators. 




(1) Addition of operators. If linear operators $A$, $B$ are given, the
operator $C = A + B$ is defined by the formula

$$Cx = (A + B)x = Ax + Bx.$$

(2) Multiplication of an operator by a scalar. If $A$ is a linear operator
and $\lambda$ is a real number, the operator $B = \lambda A$ is defined by
the formula

$$Bx = (\lambda A)x = \lambda (Ax).$$

(3) Multiplication of operators. If $A$, $B$ are linear operators, the
operator $C = AB$ is defined by the formula

$$Cx = ABx = A(Bx)$$

(i.e. first operating with $B$ on the vector $x$, then operating with $A$
on the result).

It is easily verified that new linear operators are obtained as a
result of all these operations. The usual algebraic laws hold for the
operations specified: the commutativity of addition, associativity,
and distributivity (with the exception of commutativity for the
multiplication of operators). The powers of an operator $A$ are defined
by the natural recurrence formulae

$$A^0 = E, \qquad A^n = A \cdot A^{n-1} \quad (n = 1, 2, \dots).$$

An operator $B$ is said to be the inverse of the operator $A$ if

$$AB = BA = E;$$

the inverse operator of $A$ is denoted by $A^{-1}$. If operators $C$, $D$
have inverses $C^{-1}$, $D^{-1}$, then $CD$ has the inverse $(CD)^{-1} = D^{-1}C^{-1}$.
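For operators on a two-dimensional space these rules can be checked by hand or by a few lines of code. The Python sketch below is an illustration under stated assumptions: operators are represented by 2 x 2 matrices in a fixed basis, the matrices `C` and `D` are arbitrary examples, and the helpers `matmul` and `inverse` are hypothetical names. It shows that multiplication of operators need not commute and that the inverse of a product is the product of the inverses in the reverse order.

```python
from fractions import Fraction

# Sketch: operators on R^2 as 2 x 2 matrices with exact rational entries.
# All names and matrices here are illustrative assumptions.

def matmul(A, B):
    """Matrix of the product operator AB (first B, then A)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("operator has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]

E = [[1, 0], [0, 1]]
C = [[1, 2], [0, 1]]
D = [[1, 0], [3, 1]]

CD = matmul(C, D)
DC = matmul(D, C)
inverse_of_product = inverse(CD)
product_of_inverses = matmul(inverse(D), inverse(C))
```

With these matrices `CD` and `DC` differ, while `inverse_of_product` and `product_of_inverses` coincide.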


3. The Norm of a Linear Operator 


We shall suppose that a linear operator A has as its domain a 
normed linear space R. 

The existence of a metric on the space $R$ yields a mapping of
each linear operator $A$ onto a non-negative number $\|A\|$ said to
be the norm of the operator $A$. We consider the real-valued function
$F(x) = |Ax|$, defined for vectors $x \in R$. The norm of the operator $A$
is defined as the exact upper bound (possibly $\infty$) of the values of
this function over the unit vectors $x$:

$$\|A\| = \sup_{|x| = 1} |Ax|. \qquad (1)$$

An operator $A$ with a finite norm is said to be bounded.


In $n$-dimensional Euclidean space the quantity $\|A\|$ is finite for
every linear operator $A$†. For the length of the vector $Ax$ is evidently
a continuous function of its coordinates $\eta_1, \eta_2, \dots, \eta_n$;
each of these coordinates is a linear function of the coordinates
$\xi_1, \xi_2, \dots, \xi_n$ of the vector $x$. It follows that $|Ax|$ is a continuous
function of the coordinates $\xi_1, \xi_2, \dots, \xi_n$ of $x$. Since the spherical
surface $|x| = 1$ is a bounded closed set in $n$-dimensional space, it
follows in virtue of the results of Chapter II, Section 7 that the
continuous function $|Ax|$ is bounded on it. The number $\|A\|$ therefore
exists, since every bounded set has an exact upper bound.
Moreover, there exists a point $x_0$ on the surface $|x| = 1$ at which
the function $|Ax|$ attains its exact upper bound.

Examples 1. The norm of the null operator is evidently equal to
zero. Conversely, $\|A\| = 0$ means that the operator $A$ maps each
normalised vector $x_0$ onto zero; but since each vector is collinear
with some normalised vector $x_0$, $Ax = 0$ for any $x$. Hence if
$\|A\| = 0$, then $A = 0$.

2. The norm of the identity operator $E$ is equal to unity since
$|Ex| = |x|$ for any vector $x$.

3. The norm of the similarity operator $Ax = \lambda x$ is equal to $|\lambda|$.

4. The norm of an operator of the normal form in Hilbert space
(art. 1, example 4)

$$Ax = A\left(\sum_{j=1}^{\infty} \xi_j e_j\right) = \sum_{j=1}^{\infty} \lambda_j \xi_j e_j$$

is equal to the exact upper bound of the numbers $|\lambda_n|$. For if
$C = \sup |\lambda_n|$ and $|x|^2 = \sum_{j=1}^{\infty} \xi_j^2 = 1$, we have

$$|Ax|^2 = \sum_{j=1}^{\infty} \lambda_j^2 \xi_j^2 \le C^2 \sum_{j=1}^{\infty} \xi_j^2 = C^2,$$

and so $\|A\| \le C$; on the other hand

$$\|A\| \ge \sup_n |A e_n| = \sup_n |\lambda_n e_n| = \sup_n |\lambda_n| = C;$$

our assertion follows from these inequalities.
5. The norm of an operator consisting in multiplication by a
bounded function $g(x)$ over the space $L_2(a, b)$ (art. 1, example 5)
is equal to the number $C$ determined by the conditions

$$\mu\{x : |g(x)| > C\} = 0,$$
$$\mu\{x : |g(x)| > C - \varepsilon\} > 0 \quad \text{for any } \varepsilon > 0.$$

(For a continuous function, the number $C$ is equal to $\max_{a \le x \le b} |g(x)|$.)
The proof is left to the reader.

† The case $\|A\| = \infty$ is possible in infinite-dimensional space.


6. For the norm of Fredholm's operator (art. 1, example 6) with
a square-summable kernel $K(x, s)$ we can obtain the bound

$$\|A\|^2 \le \int_a^b \int_a^b K^2(x, s)\,dx\,ds, \qquad (2)$$

after integrating inequality (4) of art. 1 with respect to $x$.
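The bound (2) can be observed numerically on a discretised operator. The Python sketch below is only an illustration: the kernel min(x, s) on [0, 1], the test function sin(3s), the midpoint quadrature with 200 nodes, and all helper names are assumptions made for the example. Integrals are replaced by midpoint sums, for which the Cauchy-Bunyakovsky inequality still holds exactly, so the discrete analogue of $|A\varphi| \le K\,|\varphi|$ is satisfied.

```python
import math

# Sketch: discrete analogue of |A phi| <= K |phi| for a Fredholm operator.
# Kernel, test function and grid size are illustrative assumptions.

def discretise(kernel, a, b, n):
    """Sample the kernel at the midpoints of n equal subintervals of [a, b]."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]
    K = [[kernel(x, s) for s in nodes] for x in nodes]
    return nodes, K, h

def apply_operator(K, phi_values, h):
    """Midpoint-rule approximation of psi(x) = integral of K(x, s) phi(s) ds."""
    return [h * sum(k * p for k, p in zip(row, phi_values)) for row in K]

def l2_norm(values, h):
    """Midpoint-rule approximation of the norm in L2(a, b)."""
    return math.sqrt(h * sum(v * v for v in values))

nodes, K, h = discretise(lambda x, s: min(x, s), 0.0, 1.0, 200)
phi = [math.sin(3.0 * s) for s in nodes]
psi = apply_operator(K, phi, h)

# Approximates the square root of the double integral of K^2(x, s).
hs_bound = math.sqrt(h * h * sum(k * k for row in K for k in row))
```

For this kernel the double integral of $K^2$ equals 1/6, so `hs_bound` is close to 0.408.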
We consider two simple properties of operators with a finite
norm.

(1) For any vector $x \in R$ and any linear operator $A$ with finite
norm $\|A\|$,

$$|Ax| \le \|A\|\,|x|. \qquad (3)$$

For inequality (3) holds for any unit vector simply in virtue of
the definition of the norm of the operator $A$. If $x$ is an arbitrary
non-zero vector (inequality (3) obviously holds for the null vector),
then $x/|x|$ is a unit vector and consequently

$$\left|A\,\frac{x}{|x|}\right| \le \|A\|. \qquad (4)$$

But since $A$ is a linear operator, we have:

$$\left|A\,\frac{x}{|x|}\right| = \frac{1}{|x|}\,|Ax|;$$

multiplying inequality (4) by $|x|$, we get the required inequality (3).

(2) If $A$, $B$ are operators with a finite norm, then

$$\|A + B\| \le \|A\| + \|B\|, \qquad \|AB\| \le \|A\| \cdot \|B\|. \qquad (5)$$

For if $|x| = 1$, then $|(A + B)x| = |Ax + Bx| \le |Ax| + |Bx|
\le \|A\| + \|B\|$, proving the first of the inequalities (5). Further,
$|ABx| = |A(Bx)| \le \|A\|\,|Bx| \le \|A\|\,\|B\|$, proving the second
inequality.
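Inequalities (3) and (5) are easy to test numerically in the plane. In the Python sketch below, which is an illustration and not part of the text, operators on R^2 are given by 2 x 2 matrices, and the norm is computed from the standard fact that $\|A\|^2$ is the largest characteristic value of the symmetric matrix $A^T A$; the matrices `A`, `B` and the helper names are assumptions chosen for the example.

```python
import math

# Sketch: operator norm of a real 2 x 2 matrix and the inequalities (3), (5).
# The matrices and names are illustrative assumptions.

def op_norm(A):
    """Norm of A: square root of the largest characteristic value of A^T A."""
    (a, b), (c, d) = A
    p = a * a + c * c          # entries of the symmetric matrix A^T A
    r = b * b + d * d
    q = a * b + c * d
    largest = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(largest)

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def length(x):
    return math.hypot(x[0], x[1])

A = [[1.0, 2.0], [0.0, 1.0]]
B = [[0.0, 1.0], [-1.0, 3.0]]
x = [0.6, -0.8]

norm_A, norm_B = op_norm(A), op_norm(B)
norm_sum, norm_product = op_norm(add(A, B)), op_norm(matmul(A, B))
```

As a sanity check on `op_norm`, a diagonal matrix with entries 3 and -1 has norm 3, in agreement with example 4.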




Problems. 1. If $A$ is a bounded linear operator with norm $M$, then

$$\sup |(Ax, y)| = M \qquad (6)$$

(the upper bound with respect to all normed vectors $x$ and $y$). Conversely,
if the bilinear form $(Ax, y)$ is bounded on the unit sphere, the operator $A$ is
bounded and its norm does not exceed the $M$ in equation (6).

2. If the Hilbert space $H$ contains an orthonormal basis $\{e_n\}$, then every
linear operator $A$ can be represented by an infinite matrix $\|a_{jk}\|$, where

$$A e_j = \sum_{k=1}^{\infty} a_{jk} e_k.$$

For some $M$ and any $\xi_1, \dots, \xi_m$, $\eta_1, \dots, \eta_n$ it is given that

$$\left(\sum_{j=1}^{m} \sum_{k=1}^{n} a_{jk} \xi_j \eta_k\right)^2 \le M^2 \sum_{j=1}^{m} \xi_j^2 \sum_{k=1}^{n} \eta_k^2.$$

Prove that $A$ is a bounded operator with norm not exceeding $M$; conversely,
if $A$ is a bounded operator with norm $M$, the preceding condition
is fulfilled.

Hint. The left-hand side is the square of the value of the bilinear form
$(Ax, y)$ on certain vectors $x$ and $y$.

3. (continued). Obtain the inequalities

$$\sup_j \sum_{k=1}^{\infty} a_{jk}^2 \le \|A\|^2 \le \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{jk}^2. \qquad (7)$$

Hint. Firstly, $|A e_j|^2 = \sum_k (A e_j, e_k)^2 = \sum_k a_{jk}^2$. Secondly, if
$x_0 = \sum_j \xi_j e_j$ is such that $|x_0| = 1$ and $|A x_0| > \|A\| - \varepsilon$, then

$$\|A\| - \varepsilon < |A x_0| = \left|\sum_j \xi_j A e_j\right| \le \sqrt{\sum_j \xi_j^2}\ \sqrt{\sum_j |A e_j|^2}
= \sqrt{\sum_j \sum_k (A e_j, e_k)^2} = \sqrt{\sum_j \sum_k a_{jk}^2}.$$

4. An operator of the normal form possesses a bounded inverse if and
only if the corresponding numbers $\lambda_n$ exceed some positive constant in
absolute value.

5. An operator of the normal form is said to be positive if all the $\lambda_n > 0$.
Show that for a positive operator $C$ of the normal form with a bounded
inverse, the inequality $\|E - \alpha C\| < 1$ obtains for $0 < \alpha < 2/\|C\|$.

6. Show that every bounded bilinear functional $A(x, y)$ in Hilbert space $H$
can be represented in the form $(Ax, y)$, where $A$ is a bounded linear operator.

Hint. If we fix the first argument in the functional $A(x, y)$, we get a
bounded linear functional of $y$, which by Section 2, art. 9 can be expressed
in the form $(x', y)$. Verify that the operator $x' = Ax$ is a bounded linear
operator.

7. An operator $A^*$ which satisfies the condition $(Ax, y) = (x, A^* y)$ is
said to be adjoint to the operator $A$. Show that every bounded linear operator
$A$ in Hilbert space has an adjoint operator.

Hint. Apply the result of problem 6 to the bilinear functional
$A(x, y) = (x, Ay)$.




8. Show that $\|A^*\| = \|A\|$.

Hint. Use problem 1.

9. If the operator $A$ possesses a bounded inverse $B$, then its adjoint $A^*$
possesses the bounded inverse $B^*$.

10. Show that the last double sum in inequality (7) does not depend on
the choice of orthonormal basis $\{e_n\}$.

Hint. If $\{f_k\}$ is a new basis and $A f_j = \sum_k b_{jk} f_k$, then

$$\sum_j \sum_k a_{jk}^2 = \sum_j |A e_j|^2 = \sum_j \sum_k (A e_j, f_k)^2 = \sum_j \sum_k (e_j, A^* f_k)^2
= \sum_k |A^* f_k|^2 = \sum_j \sum_k b_{jk}^2.$$


4. Characteristic Vectors


A subspace $R'$ of a linear space $R$ is said to be invariant under
the operator $A$ if $x \in R'$ implies $Ax \in R'$.

In particular, the trivial subspaces—the null space and the whole 
space—are invariant under any linear operator; we shall naturally 
concern ourselves only with non-trivial invariant subspaces. 

The one-dimensional invariant subspaces of an operator $A$ play
a special role. Every (non-zero) vector that belongs to a one-dimensional
invariant subspace of the operator $A$ is said to be
a characteristic vector of the operator $A$; in other words, a vector
$x \ne 0$ is said to be a characteristic vector of the operator $A$ if it is
mapped by $A$ onto a collinear vector:

$$Ax = \lambda x. \qquad (1)$$

The number $\lambda$ which appears in this equation is said to be the
characteristic value (characteristic number) of the operator $A$
corresponding to the characteristic vector $x$.

We consider the examples of linear operators given in art. 1 
from this point of view. 

(1) For the operators in examples 1-3 every subspace is invariant
and every non-zero vector of the space is characteristic with
characteristic values $0$, $1$, $\lambda$ respectively.

(2) An operator of the normal form (example 4) by definition
has characteristic vectors $e_1, e_2, \dots, e_n, \dots$ with characteristic values
$\lambda_1, \lambda_2, \dots, \lambda_n, \dots$ respectively.

(3) The operator of multiplication by $g(x) = x$ (example 5) has
no characteristic vectors in the space $L_2(a, b)$ since there is no
measurable function $\varphi(x)$, non-zero on a set of positive measure
and such that

$$x\,\varphi(x) = \lambda\,\varphi(x)$$

almost everywhere.

(4) The characteristic vectors of Fredholm's operator (example 6)
are the solutions of the integral equation

$$\int_a^b K(x, s)\,\varphi(s)\,ds = \lambda\,\varphi(x).$$

We shall discuss the existence of solutions to this equation later.

The set of all characteristic vectors of an operator $A$ with a
fixed characteristic value $\lambda$, together with the null vector, evidently
constitutes a subspace of the space $R$. This subspace is said to be the
characteristic subspace corresponding to the characteristic value $\lambda$.


5. Symmetric and Completely Continuous Operators 


If it is known that a linear operator A in Hilbert space is an 
operator of the normal form, as in example 4 of art. 1, the investi- 
gation of its properties is markedly facilitated. A basis of charac- 
teristic vectors of the operator A determines a unique “coordinate 
system” in which problems connected with the operator A can 
conveniently be solved. 

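A necessary condition for an operator $A$ to be reducible to the
normal form is the equation

$$(Ax, y) = (x, Ay), \qquad (1)$$

which must be satisfied for any $x$, $y$ in the space $H$. For if

$$x = \sum_{j=1}^{\infty} \xi_j e_j, \quad y = \sum_{j=1}^{\infty} \eta_j e_j, \quad Ax = \sum_{j=1}^{\infty} \lambda_j \xi_j e_j, \quad Ay = \sum_{j=1}^{\infty} \lambda_j \eta_j e_j,$$

then clearly

$$(Ax, y) = \sum_{j=1}^{\infty} \lambda_j \xi_j \eta_j, \qquad (x, Ay) = \sum_{j=1}^{\infty} \lambda_j \xi_j \eta_j,$$

so that equation (1) holds. Operators which satisfy condition (1)
are said to be symmetric.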

The symmetry condition is not sufficient to ensure that the operator
$A$ is reducible to the normal form. For example, the operator
of multiplication by $x$ over the space $L_2(a, b)$ is symmetric:

$$(x\varphi, \psi) = \int_a^b x\,\varphi(x)\,\psi(x)\,dx = (\varphi, x\psi),$$

but as we saw above this operator has no characteristic vectors
and is therefore not reducible to the normal form. In addition to
the symmetry requirement, we must impose a further condition
on the operator $A$, which we shall call the condition of complete
continuity:

From each sequence of vectors $A f_n$, where the numbers $|f_n|$ are
bounded, a convergent subsequence can be extracted.

Operators which possess this property are said to be completely
continuous.

A completely continuous operator is bounded (and consequently
continuous): if for some sequence $f_n$, $|f_n| = 1$, we had $|A f_n| \to \infty$,
say $|A f_n| > n$, it would be impossible to extract a convergent
subsequence from the sequence $A f_n$, in contradiction to the hypothesis.


Problems. 1. Is the identity operator $E$ completely continuous?
Answer. Not if the space $H$ is infinite-dimensional.
2. $A$ and $B$ are completely continuous operators; show that $A + B$ is
completely continuous.
3. $A$ is a completely continuous operator, $B$ is bounded; show that $AB$
and $BA$ are completely continuous.
4. If $A$ and $A^*$ are mutually adjoint operators, then $A + A^*$, $AA^*$, $A^*A$
are symmetric operators, and $\|AA^*\| = \|A^*A\| = \|A\|^2$.
5. Show that the necessary and sufficient condition for an operator of the
normal form to be completely continuous is

$$\lim_{n \to \infty} \lambda_n = 0.$$

Hint. Use problem 3, Section 1.


6. We now proceed to the fundamental theorem on symmetric,
completely continuous operators.

THEOREM 1. (D. Hilbert) In a complete separable Hilbert space
every symmetric completely continuous operator possesses a complete
orthogonal system of characteristic vectors.

We shall carry out the proof of this theorem in several stages.

Lemma 1. If $|e| = 1$ and $A$ is a symmetric operator, then

$$|Ae|^2 \le |A^2 e|,$$

with the equality sign possible only if $e$ is a characteristic vector
of the operator $A^2$ with the characteristic value

$$\lambda = |Ae|^2.$$

Proof. In virtue of the symmetry of the operator and the
Cauchy-Bunyakovsky inequality, we have:

$$|Ae|^2 = (Ae, Ae) = (A^2 e, e) \le |A^2 e|\,|e| = |A^2 e|. \qquad (1)$$

The Cauchy-Bunyakovsky inequality reduces to equality only if
the vectors which figure in it are collinear (Section 1, art. 3), hence
in the case of equality we have

$$A^2 e = \lambda e,$$

i.e. $e$ is a characteristic vector of the operator $A^2$. Substituting
this expression in (1), we get for $\lambda$:

$$(A^2 e, e) = (\lambda e, e) = \lambda = |Ae|^2,$$

as required.

We shall call the maximal vector of a bounded operator $A$ the unit
vector $e$, $|e| = 1$, on which the quantity $|Ae|$ attains its greatest
value $M = \|A\|$. In general, not every bounded operator will have
a maximal vector. But we shall show that a symmetric, completely
continuous operator always has a maximal vector:

Lemma 2. A symmetric, completely continuous operator possesses
a maximal vector.

Proof. We may suppose that $M = \|A\| > 0$. We choose a sequence
$y_n = A x_n$, where $|x_n| = 1$ ($n = 1, 2, \dots$), such that
$\lim_{n \to \infty} |y_n| = M$. By hypothesis, a convergent
subsequence can be extracted from the sequence $y_n$; deleting
the remaining vectors and modifying the numbering, we can
assume that the sequence $y_n$ itself converges as $n \to \infty$; let
$y = \lim y_n$. In virtue of the continuity of the norm,

$$|y| = \lim_{n \to \infty} |y_n| = M.$$

We claim that the vector $z = \frac{1}{M}\,y$ is the maximal vector sought.
It is a unit vector, since $|y| = M$. In the first place, we have in virtue
of the continuity of the operator $A$:

$$Az = \lim_{n \to \infty} A\left(\frac{A x_n}{M}\right).$$

The vectors $A x_n / M$ belong to the closed unit ball, and hence
the vectors $A(A x_n / M)$ do not exceed $M$ in length. Applying
lemma 1, we get

$$|A^2 x_n| \ge |A x_n|^2 \to M^2,$$

so that

$$M \ge \left|A\left(\frac{A x_n}{M}\right)\right| = \frac{1}{M}\,|A^2 x_n| \ge \frac{1}{M}\,|A x_n|^2 \to M,$$

and it follows that

$$|Az| = \lim_{n \to \infty} \left|A\left(\frac{A x_n}{M}\right)\right| = M,$$

i.e. $z$ is the maximal vector of the operator $A$, as required.

It is not a far cry from maximal vectors to characteristic vectors:

Lemma 3. If $e_0$ is the maximal vector of a symmetric operator $A$,
it is a characteristic vector of the operator $A^2$ with characteristic value
$\|A\|^2$.

Proof. By Lemma 1 and the definition of the norm of an operator,
we have

$$\|A\|^2 = |A e_0|^2 \le |A^2 e_0| \le \|A\|^2,$$

and hence

$$|A e_0|^2 = |A^2 e_0| = \|A\|^2.$$

By lemma 1, $e_0$ is a characteristic vector of the operator $A^2$ with
the characteristic value

$$\lambda = |A e_0|^2 = \|A\|^2,$$

as required.

Lemma 4. If the operator $A^2$ possesses a characteristic vector with
characteristic value $M^2$, the operator $A$ possesses a characteristic
vector with characteristic value $M$ or $-M$.

Proof. The equation $A^2 e_0 = M^2 e_0$ can be written in the form
$(A - ME)(A + ME) e_0 = 0$ (where $E$ is the identity operator).
Let us suppose that $z_0 = (A + ME) e_0 \ne 0$. Then the condition

$$(A - ME) z_0 = 0,$$

or what is the same thing,

$$A z_0 = M z_0,$$

implies that $z_0$ is a characteristic vector of the operator $A$ with the
characteristic value $M$. And if $(A + ME) e_0 = 0$, then

$$A e_0 = -M e_0,$$

and we get that $e_0$ is a characteristic vector of the operator $A$ with
the characteristic value $-M$. The lemma is proved.
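In a finite-dimensional space the chain of lemmas 1-4 can be followed numerically. The Python sketch below is an illustration under stated assumptions and is not the method of the text: it finds an approximate maximal vector $e_0$ by power iteration with $A^2$ (a standard numerical technique substituted here for the compactness argument), and then applies the case distinction of lemma 4 to $z_0 = (A + ME) e_0$. The matrices, the starting vector, the tolerance and the function names are all hypothetical choices.

```python
import math

# Sketch: maximal vector and characteristic value +M or -M of a symmetric
# operator on R^n. Power iteration and all names are illustrative assumptions.

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def maximal_vector(A, start, steps=200):
    """Power iteration with A^2: returns a unit vector e0 with |A e0| near ||A||."""
    e = [xi / norm(start) for xi in start]
    for _ in range(steps):
        y = apply(A, apply(A, e))
        e = [yi / norm(y) for yi in y]
    return e

def characteristic_pair(A, start, tol=1e-8):
    """Follow lemma 4: test whether z0 = (A + M E) e0 vanishes."""
    e0 = maximal_vector(A, start)
    M = norm(apply(A, e0))
    z0 = [ai + M * ei for ai, ei in zip(apply(A, e0), e0)]
    if norm(z0) > tol:
        return M, [zi / norm(z0) for zi in z0]   # A z0 = M z0
    return -M, e0                                # A e0 = -M e0

A_plus = [[0.0, 2.0], [2.0, 3.0]]    # characteristic values 4 and -1
A_minus = [[0.0, 2.0], [2.0, -3.0]]  # characteristic values 1 and -4

value_plus, vector_plus = characteristic_pair(A_plus, [1.0, 0.0])
value_minus, vector_minus = characteristic_pair(A_minus, [1.0, 0.0])
```

The first matrix falls under the case $z_0 \ne 0$ and yields the value $M$; the second falls under the case $(A + ME) e_0 = 0$ and yields $-M$.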


Lemmas 1-4 show that every symmetric, completely continuous
operator $A$ possesses a characteristic vector with characteristic value
$\pm\|A\|$. We shall now prove that a complete orthogonal system can
be constructed in the space $H$ from the characteristic vectors of
such an operator $A$. We lead up to this construction with the following
lemmas:




Lemma 5. The characteristic vectors of a symmetric operator
corresponding to different characteristic values are mutually orthogonal.

For let

$$Ax = \lambda x, \qquad Ay = \mu y$$

with $\lambda \ne \mu$. We multiply the first equation scalarly by $y$, the
second by $x$, and subtract the second from the first:

$$(Ax, y) - (x, Ay) = (\lambda - \mu)(x, y).$$

The left-hand side of this equation vanishes on account of the
symmetry of the operator. Since $\lambda \ne \mu$, we have $(x, y) = 0$, as
required.

Lemma 6. Every orthonormal system of characteristic vectors of a
completely continuous operator $A$ with characteristic values exceeding
a positive number $\delta$ in modulus is finite.

Proof. Let us suppose that an infinite system $S$ of such characteristic
vectors has been found. Each of them is mapped by the
operator onto a scalar multiple of itself under multiplication by
a number greater than $\delta$ in modulus.

Let $e_j$, $e_k$ be any two of these characteristic vectors:

$$|e_j| = |e_k| = 1, \quad (e_j, e_k) = 0, \quad A e_j = \lambda_j e_j, \quad A e_k = \lambda_k e_k.$$

We have

$$|A e_j - A e_k|^2 = |\lambda_j e_j - \lambda_k e_k|^2 = \lambda_j^2 + \lambda_k^2 > 2\delta^2.$$

This means that the distances between the vectors obtained by
operating with $A$ on the vectors of the system $S$ will exceed $\delta\sqrt{2}$.
But then it is impossible to select a convergent sequence from the
aggregate of such vectors, which contradicts the complete continuity
of the operator $A$.

In particular, there exists only a finite number of mutually orthogonal
vectors with a given characteristic value $\lambda \ne 0$; in other words,
every characteristic subspace that corresponds to a non-zero characteristic
value of a symmetric, completely continuous operator $A$ is
finite-dimensional.

This lemma allows us to draw definite conclusions in relation to
the totality of characteristic vectors and characteristic values of
the operator $A$. We consider the set of all characteristic values
of the operator $A$, which is a subset of the real axis. By lemma 6
there exists only a finite number of characteristic values the moduli
of which exceed a given positive number $\delta$, and so if the characteristic
values constitute an infinite set, they must form a sequence
which converges to zero. We can therefore order them according
to their decrease in absolute magnitude. We also agree to repeat
each characteristic value a number of times equal to the dimensionality
of the corresponding characteristic subspace. We can
then map the sequence of all non-zero characteristic values

$$\lambda_1, \lambda_2, \dots, \lambda_n, \dots$$

of the operator onto the sequence of characteristic vectors

$$e_1, e_2, \dots, e_n, \dots,$$

where $A e_n = \lambda_n e_n$ ($n = 1, 2, \dots$). We can assume that the vectors
$e_1, e_2, \dots$ are mutually orthogonal and normalised. In fact if
$\lambda_n \ne \lambda_m$, then $e_n$ and $e_m$ are orthogonal by lemma 5; and if $\lambda_n = \lambda_m$,
we can always carry out an orthogonalisation in the finite-dimensional
characteristic subspace corresponding to the characteristic
value $\lambda_n = \lambda_m$. The construction is completed by normalising all
the vectors obtained.

We shall show now that each vector $z$ orthogonal to all the constructed
vectors $e_1, e_2, \dots, e_n, \dots$ is mapped by the operator $A$ onto
zero. To do this, we use the following lemma:

Lemma 7. Let $H'$ be a subspace of the Hilbert space $H$ which is
invariant relative to a symmetric operator $A$. Then the orthogonal
complement $H''$ of the subspace $H'$ is also invariant relative to the
operator $A$.

Proof. Let $x$ be any vector of the subspace $H'$ and $y$ any vector
of the subspace $H''$. By hypothesis $Ax \in H'$, so that $(Ax, y) = 0$. But
then in virtue of the symmetry of the operator, $(x, Ay) = 0$. This means
that the vector $Ay$ is orthogonal to any vector $x \in H'$ and consequently
$Ay \in H''$ for any $y \in H''$, as required.

We now consider the aggregate $P$ of all vectors $z$ orthogonal to
each of the constructed vectors $e_1, e_2, \dots, e_n, \dots$ It is a closed
subspace since it is the orthogonal complement of the linear envelope
$L(e_1, e_2, \dots, e_n, \dots) = L$†. Since the linear envelope $L$ is evidently
invariant relative to $A$, its orthogonal complement $P$ is
also invariant relative to $A$, by lemma 7. We denote by $M(P)$ the
exact upper bound of the values of $|Ax|$ on the unit spherical
surface of the subspace $P$. By lemmas 1-4 the subspace $P$ contains a
characteristic vector $e_0$ with the characteristic value $\lambda_0 = \pm M(P)$.
But by construction $P$ cannot contain a single characteristic vector
with non-zero characteristic value. It follows that $\lambda_0 = M(P) = 0$,
but this means that $Az = 0$ for any vector $z \in P$, as asserted.

† The linear envelope of a system of vectors $x_1, x_2, \dots, x_n, \dots$ is defined
as the subspace comprising all their linear combinations.

We denote by $L'$ the closure of the linear envelope of the vectors
$e_1, e_2, \dots$; the orthogonal complement of this closure is also the
subspace $P$. Each vector $x \in H$ can be represented in the form of
a sum

$$x = x' + x'', \qquad x' \in L', \quad x'' \in P.$$

Further, the vector $x'$ can be developed in a Fourier series with
respect to the system $e_1, e_2, \dots, e_n, \dots$, which is complete in the space
$L'$; by what we have proved, the vector $x''$ is mapped by the operator
$A$ onto zero. We have obtained the following fundamental
theorem:

THEOREM 2. Each vector $x$ of a complete Hilbert space $H$ in which
a symmetric, completely continuous operator $A$ is given can be
expressed in the form of an orthogonal sum

$$x = x' + x'' = \sum_{j=1}^{\infty} \xi_j e_j + x'',$$

where $e_1, e_2, \dots$ are characteristic vectors of the operator $A$ with
non-zero characteristic values and $Ax'' = 0$.


Hilbert's theorem is a corollary of this theorem. For in a separable
space $H$ the subspace $P$ is also separable and it contains a
complete orthogonal system $e'_1, e'_2, \dots, e'_n, \dots$; together with the
vectors $e_1, e_2, \dots, e_n, \dots$ already constructed this yields a complete
orthogonal system for the whole space $H$. Each vector of this
system is a characteristic vector of the operator $A$: the vectors $e_n$
with characteristic values $\lambda_n \ne 0$ ($n = 1, 2, \dots$) and the vectors $e'_n$
with the characteristic value $0$. This completes the proof of Hilbert's
theorem.
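The orthogonal sum of Theorem 2 can be written out concretely for a symmetric operator on R^3. The Python sketch below is an illustration only: the operator is a hypothetical example assembled from two chosen orthonormal characteristic vectors with values 3 and -2, so that the third coordinate direction is mapped onto zero, and the names `expand` and `A` are assumptions. The code splits a vector x into its Fourier part x' and a remainder x'' with Ax'' = 0, and checks that Ax is obtained by multiplying each Fourier coefficient by the corresponding characteristic value.

```python
import math

# Sketch: x = sum_j xi_j e_j + x'' for a symmetric operator on R^3.
# The operator and all names are illustrative assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply(A, x):
    return [dot(row, x) for row in A]

s = 1.0 / math.sqrt(2.0)
vectors = [[s, s, 0.0], [s, -s, 0.0]]   # orthonormal characteristic vectors
values = [3.0, -2.0]                    # their non-zero characteristic values

# A = 3 e1 e1^T - 2 e2 e2^T, written out explicitly.
A = [[0.5, 2.5, 0.0],
     [2.5, 0.5, 0.0],
     [0.0, 0.0, 0.0]]

def expand(x):
    """Return (coefficients xi_j, x', x'') with x = x' + x''."""
    xi = [dot(x, e) for e in vectors]
    x_prime = [sum(c * e[i] for c, e in zip(xi, vectors)) for i in range(3)]
    x_second = [a - b for a, b in zip(x, x_prime)]
    return xi, x_prime, x_second

x = [4.0, 2.0, 5.0]
xi, x_prime, x_second = expand(x)

# A x computed from the expansion: sum_j lambda_j xi_j e_j.
Ax_from_series = [sum(l * c * e[i] for l, c, e in zip(values, xi, vectors))
                  for i in range(3)]
Ax_direct = apply(A, x)
```

Here x' = (4, 2, 0) and x'' = (0, 0, 5), and both ways of computing Ax give (7, 11, 0) up to rounding.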

Note. Vectors which belong to the range of an operator $A$, i.e.
vectors of the form

$$\psi = A\varphi,$$

are said to be source-representable. We shall explain the significance
of this term below (Section 5).

Every source-representable vector $\psi$ admits of a development in the
characteristic vectors of the operator $A$ with non-zero characteristic
values.

For by Hilbert's theorem

$$\varphi = \sum_{j=1}^{\infty} \xi_j e_j + x'',$$

where $A e_n = \lambda_n e_n$ ($\lambda_n \ne 0$) and $Ax'' = 0$. Operating on this
equation with $A$, we get:

$$\psi = A\varphi = \sum_{j=1}^{\infty} \lambda_j \xi_j e_j,$$

as asserted.


4. INTEGRAL OPERATORS WITH SQUARE-SUMMABLE KERNELS 


1. We shall apply the theory expounded in Section 3 to Fredholm's
integral operator

$$A\varphi(x) = \int_a^b K(x, s)\,\varphi(s)\,ds$$

with a square-summable kernel $K(x, s)$:

$$\int_a^b \int_a^b K^2(x, s)\,dx\,ds = K^2 < \infty. \qquad (1)$$

As we saw in art. 1 of Section 3 Fredholm's operator is a bounded
operator on the Hilbert space $H = L_2(a, b)$ and has a norm equal
to at most $K$.

If the kernel $K(x, s)$ is symmetric, i.e. if we have

$$K(x, s) = K(s, x)$$

almost everywhere in the region $G = \{a \le x \le b,\ a \le s \le b\}$, then
Fredholm's operator is also symmetric, i.e. $(A\varphi, \psi) = (\varphi, A\psi)$ for
any $\varphi$, $\psi$ in $L_2(a, b)$.

For by Fubini's theorem

$$(A\varphi, \psi) = \int_a^b \left\{\int_a^b K(x, s)\,\varphi(s)\,ds\right\} \psi(x)\,dx
= \int_a^b \int_a^b K(x, s)\,\varphi(s)\,\psi(x)\,ds\,dx$$

$$= \int_a^b \varphi(s) \left\{\int_a^b K(x, s)\,\psi(x)\,dx\right\} ds
= \int_a^b \varphi(s) \left\{\int_a^b K(s, x)\,\psi(x)\,dx\right\} ds = (\varphi, A\psi).$$

The existence of the double integral

$$\int_a^b \int_a^b K(x, s)\,\varphi(s)\,\psi(x)\,ds\,dx,$$

which is one of the conditions under which Fubini's theorem is
applicable, follows from the existence of the integral

$$\int_a^b \int_a^b K^2(x, s)\,dx\,ds$$

and

$$\int_a^b \int_a^b \psi^2(x)\,\varphi^2(s)\,dx\,ds = \int_a^b \psi^2(x)\,dx \int_a^b \varphi^2(s)\,ds.$$

We shall show now that Fredholm’s operator with a square-summable
kernel is completely continuous. We recall that a linear
operator $A$ is said to be completely continuous if a convergent
subsequence can be extracted from each sequence $A f_n$ for which the
norms $\|f_n\|$ are bounded. In other words, the operator $A$ is completely
continuous if it maps any bounded subset of the space $H$ onto a
compact set (Chapter II, Section 7). For instance let $A$ be a
bounded operator that maps the space $H$ onto a finite-dimensional
subspace $Q$; we claim that $A$ is then completely continuous. For
the vectors $A f_n$ form a bounded set in the finite-dimensional
subspace $Q$, and in accordance with the results of Chapter II, Section 7,
such a set is compact, so that $A$ satisfies the condition for complete
continuity of an operator.

If the function $K(x, s)$ is of the form

$$K(x, s) = \sum_{k=1}^{m} \varphi_k(x)\,\psi_k(s),$$

where $\varphi_k(x)$, $\psi_k(s)$ ($k = 1, 2, \dots, m$) are functions the squares of
which are integrable (such a kernel $K(x, s)$ is said to be degenerate),
then the operator $A$ is bounded and

$$A\varphi(x) = \int_a^b \sum_{k=1}^{m} \varphi_k(x)\,\psi_k(s)\,\varphi(s)\,ds
= \sum_{k=1}^{m} \left\{ \int_a^b \psi_k(s)\,\varphi(s)\,ds \right\} \varphi_k(x);$$

i.e. the operator $A$ maps the whole space $L_2(a, b)$ onto the
finite-dimensional subspace generated by the functions $\varphi_1(x), \dots, \varphi_m(x)$.
Hence a Fredholm operator with a degenerate kernel is completely
continuous.




To deal with the general case we use the following lemma: 

Lemma. Let there be given in a Hilbert space $H$ a sequence
$A_1, A_2, \dots, A_n, \dots$ of linear operators which converges to an operator
$A$ in the sense that $\|A - A_n\| \to 0$ as $n \to \infty$. If the operators $A_n$
($n = 1, 2, \dots$) are completely continuous, then so is the limit
operator $A$.

Proof. Let the vector $f$ run through a bounded set $B$, say a
sphere of radius $r$ with centre at $O$; we have to show that the
vector $A f$ then runs through a compact set. It is sufficient to establish
that for any $\varepsilon > 0$ the set $\{A f\}$ possesses a compact $\varepsilon$-net. We
find an operator $A_n$ in the sequence $A_n \to A$ such that
$\|A_n - A\| < \varepsilon / r$. By hypothesis the set of elements $\{A_n f\}$ ($f \in B$)
is compact, and for any $f \in B$ we have
$\|A_n f - A f\| \le \|A_n - A\|\,\|f\| < (\varepsilon / r) \cdot r = \varepsilon$.
It follows that the set $\{A_n f\}$ is a compact $\varepsilon$-net for
$\{A f\}$, as required.

Now let $K(x, s)$ be an arbitrary function the square of which is
integrable over the region $G = \{a \le x \le b,\ a \le s \le b\}$. We claim
that this function can be developed [in the metric of $L_2(G)$]
in a series of the form

$$K(x, s) = \sum_{m,n=1}^{\infty} a_{mn}\, e_m(x)\, e_n(s). \qquad (2)$$

We take as the functions $e_m(x)$ any complete orthogonal system
in the space $L_2(a, b)$. Then the products $e_m(x)\, e_n(s)$ ($m, n = 1, 2, \dots$)
form a complete orthogonal system in the space $L_2(G)$. That the system
is orthogonal is obvious; we verify that it is complete. If the space
$L_2(G)$ contained a function $f(x, s)$ orthogonal to all the products
$e_m(x)\, e_n(s)$, so that

$$\int_a^b\!\!\int_a^b f(x, s)\, e_m(x)\, e_n(s)\,dx\,ds = 0 \qquad (m, n = 1, 2, \dots),$$

then by Fubini’s theorem we should have for any fixed $n$

$$\int_a^b e_m(x) \left\{ \int_a^b f(x, s)\, e_n(s)\,ds \right\} dx = 0.$$

It would then follow, since the system $e_m(x)$ is complete, that for
any $n = 1, 2, \dots$ and for almost all $x$

$$\int_a^b f(x, s)\, e_n(s)\,ds = 0.$$

This would imply, since the system $e_n(s)$ is complete, that for
almost all $x$, $s$

$$f(x, s) = 0$$

and $f(x, s)$ would therefore be the zero element of the space $L_2(G)$.
Thus the functions $e_m(x)\, e_n(s)$ really do form a complete
orthogonal system in the space $L_2(G)$. But then by the fundamental
theorem of Section 2 each element of the space $L_2(G)$ admits of
a development (2), as asserted.
The degenerate kernels formed by the partial sums $K_{pq}(x, s)$
of the series (2),

$$K_{pq}(x, s) = \sum_{m=1}^{p} \sum_{n=1}^{q} a_{mn}\, e_m(x)\, e_n(s),$$

determine a sequence of completely continuous operators

$$A_{pq}\varphi(x) = \int_a^b K_{pq}(x, s)\,\varphi(s)\,ds.$$

Using the bound for the norm of Fredholm’s operator (cf. p. 207),
we get the inequality

$$\|A - A_{pq}\|^2 \le \int_a^b\!\!\int_a^b [K(x, s) - K_{pq}(x, s)]^2\,dx\,ds,$$

from which it follows that the operators $A_{pq}$ converge in the norm
to the operator $A$ as $p \to \infty$, $q \to \infty$.

Applying the lemma, we conclude that the operator $A$ is
completely continuous together with the operators $A_{pq}$, as asserted.

2. Thus a Fredholm operator with a symmetric square-summable
kernel $K(x, s)$ is symmetric and completely continuous. We can
therefore apply Hilbert’s theorem (Section 3); in virtue of this
theorem, there exists in the space $L_2(a, b)$ a complete orthonormal
system composed of characteristic functions of Fredholm’s operator.
We shall show that the squares of the characteristic values of
Fredholm’s operator form a convergent series.

Let us consider the equation which determines the normalized
characteristic functions of Fredholm’s operator:

$$\int_a^b K(x, s)\, e_n(s)\,ds = \lambda_n e_n(x); \qquad (1)$$

it shows that the quantity $\lambda_n e_n(x)$ is a Fourier coefficient of the
function $K(x, s)$ (for a constant value of $x$). Hence, applying
Bessel’s inequality (Section 2, art. 4) to (1), we get

$$\int_a^b K^2(x, s)\,ds \ge \sum_{n=1}^{N} \lambda_n^2\, e_n^2(x) \qquad (2)$$

for every value of $N$. Integrating this inequality with respect to $x$,
and recalling that the $e_n(x)$ are normalized, we get

$$\int_a^b\!\!\int_a^b K^2(x, s)\,ds\,dx \ge \sum_{n=1}^{N} \lambda_n^2$$

for any natural number $N$. It follows that this series converges,
as required.
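
As a hedged numerical illustration of this bound (not taken from the original text), consider on $[0, \pi]$ the kernel $K(x, s) = -\min(x, s)\,(\pi - \max(x, s))/\pi$. It is assumed here, and spot-checked in the code, that its characteristic functions are $\sin nx$ with characteristic values $-1/n^2$. The midpoint quadrature and all names below are illustrative choices.

```python
import math

def green_kernel(x, s):
    """Assumed example kernel on [0, pi]: -min(x, s) * (pi - max(x, s)) / pi."""
    lo, hi = (x, s) if x <= s else (s, x)
    return -lo * (math.pi - hi) / math.pi

n_nodes = 400
h = math.pi / n_nodes
nodes = [(i + 0.5) * h for i in range(n_nodes)]

def apply_to_sine(k, x):
    """Approximate the integral of K(x, s) sin(k s) ds, expected near -sin(k x) / k^2."""
    return h * sum(green_kernel(x, s) * math.sin(k * s) for s in nodes)

# Midpoint-rule approximation of the double integral of K^2 over the square.
kernel_norm_sq = h * h * sum(green_kernel(x, s) ** 2 for x in nodes for s in nodes)

# Partial sums of the squared characteristic values, lambda_k = -1 / k^2.
partial_sums = []
total = 0.0
for k in range(1, 51):
    total += (1.0 / k ** 2) ** 2
    partial_sums.append(total)

print(kernel_norm_sq, partial_sums[-1])
```

Every partial sum stays below the double integral of $K^2$ (up to quadrature error), and the partial sums approach it as more terms are taken.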

3. We consider Fredholm’s operator with a kernel which satisfies
the Hilbert-Schmidt condition:

$$\int_a^b K^2(x, s)\,ds \le C. \qquad (1)$$

When this condition is satisfied, every function $\varphi(x) \in L_2(a, b)$
is mapped by the operator $A$ onto a bounded function, since

$$[A\varphi(x)]^2 = \left[ \int_a^b K(x, s)\,\varphi(s)\,ds \right]^2
\le \int_a^b K^2(x, s)\,ds \int_a^b \varphi^2(s)\,ds \le C \int_a^b \varphi^2(s)\,ds. \qquad (2)$$

In particular, each characteristic function of the operator $A$ with
non-zero characteristic value is bounded.
We shall call a function $\psi(x)$ of the form

$$\psi(x) = \int_a^b K(x, s)\,\varphi(s)\,ds = A\varphi$$

with an arbitrary function $\varphi(x) \in L_2(a, b)$ source-representable in
the kernel $K(x, s)$ (in accordance with the general definition given
at the end of Section 3).

At the end of Section 3 it was shown that, when $A$ is a symmetric
operator, every source-representable vector $\psi = A\varphi$ admits
of a development of the following form in the characteristic vectors
of $A$:

$$\psi = \sum_{n=1}^{\infty} \lambda_n (\varphi, e_n)\, e_n. \qquad (3)$$


We shall show that when condition (1) is satisfied the series (3)
converges for any source-representable function not only in the
norm, but also absolutely and uniformly. For by Cauchy’s
inequality (cf. p. 192) we have for any $n$, $m$

$$\left\{ \sum_{k=n}^{m} |(\varphi, e_k)\, \lambda_k e_k(x)| \right\}^2
\le \sum_{k=n}^{m} (\varphi, e_k)^2 \sum_{k=n}^{m} \lambda_k^2\, e_k^2(x). \qquad (4)$$

Since the sum $\lambda_n^2 e_n^2(x) + \dots$ on the right-hand side of (4) is
always bounded [in virtue of inequality (2) of art. 2] and the sum
$(\varphi, e_n)^2 + \dots$ can be made arbitrarily small by taking $n$ sufficiently
large [by Bessel’s inequality (cf. p. 203)], the sum on the
left-hand side of (4) can likewise be made arbitrarily small; hence
the Fourier series under consideration converges absolutely and
uniformly, as required.

The property established here is sometimes called the
Hilbert-Schmidt theorem.

Note. If the kernel $K(x, s)$ is continuous over the region
$a \le x \le b$, $a \le s \le b$, the corresponding Fredholm operator
maps every function $\varphi(x) \in L_2(a, b)$ onto a continuous function.
For if we put

$$\psi(x) = \int_a^b K(x, s)\,\varphi(s)\,ds$$

then for any $x'$, $x''$,

$$|\psi(x') - \psi(x'')|^2 \le \left[ \int_a^b |K(x', s) - K(x'', s)|\,|\varphi(s)|\,ds \right]^2
\le \int_a^b [K(x', s) - K(x'', s)]^2\,ds \int_a^b [\varphi(s)]^2\,ds,$$

and the continuity of the function $\psi(x)$ follows immediately. Of
course all the characteristic functions with non-zero characteristic
values are then also continuous.




4. Determination of Characteristic Functions and Characteristic
Values

The practical application of our results requires a knowledge of 
the system of characteristic functions corresponding to a Fred- 
holm operator. 

If the kernel $K(x, s)$ of a Fredholm operator $A$ is degenerate,
so that

$$K(x, s) = \sum_{j=1}^{m} p_j(x)\, q_j(s),$$

then, as we have already seen, the operator $A$ maps the whole
space $L_2$ onto the finite-dimensional subspace generated by the
functions $p_j(x)$ ($j = 1, 2, \dots, m$). The characteristic functions with
non-zero characteristic values should therefore be sought only in
this subspace; they must be of the form

$$e(x) = \sum_{j=1}^{m} c_j\, p_j(x). \qquad (1)$$

To determine the coefficients $c_j$ we substitute the function (1) in
the defining equation of the characteristic functions

$$\int_a^b K(x, s)\, e(s)\,ds = \lambda e(x).$$

Then we get:

$$\lambda e(x) = \sum_{j=1}^{m} \lambda c_j\, p_j(x) = \int_a^b K(x, s) \sum_{i=1}^{m} c_i\, p_i(s)\,ds$$

$$= \sum_{i=1}^{m} \sum_{j=1}^{m} p_j(x)\, c_i \int_a^b q_j(s)\, p_i(s)\,ds = \sum_{j=1}^{m} \sum_{i=1}^{m} c_i\, q_{ij}\, p_j(x),$$

where

$$q_{ij} = \int_a^b p_i(s)\, q_j(s)\,ds.$$

Consequently

$$\lambda c_j = \sum_{i=1}^{m} q_{ij}\, c_i \qquad (j = 1, 2, \dots, m). \qquad (2)$$

This system of equations allows $\lambda$ and the constants $c_j$ to be
found in the usual way. It assumes a particularly simple form
if $p_j(x) = q_j(x)$ and $(p_i, p_j) = 0$ for $i \neq j$.


In this case the $q_{ij}$ vanish for $i \neq j$ and the system (2) has the
obvious solution: $c_j = 1$ for some $j$, $c_i = 0$ for $i \neq j$, $\lambda = q_{jj}$. By
formula (1) the characteristic function corresponding to this
solution coincides with the function $p_j(x) = q_j(x)$. Thus the
characteristic functions are the functions $p_j(x)$ ($j = 1, 2, \dots, m$)
themselves and the characteristic values are the numbers $q_{jj}$, i.e. the
squares of the norms of these functions.
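
The reduction to the finite system (2) can be sketched in code. The example below is an illustration under stated assumptions, not the author’s procedure: it takes the kernel $\cos(x + s) = \cos x \cos s - \sin x \sin s$ on $[0, \pi]$, so that $p_1 = \cos$, $q_1 = \cos$, $p_2 = \sin$, $q_2 = -\sin$, computes the numbers $q_{ij}$ by a midpoint rule, and finds $\lambda$ from the characteristic polynomial of the $2 \times 2$ matrix. The helper names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, math.pi
p = [math.cos, math.sin]                       # p_1, p_2
q = [math.cos, lambda s: -math.sin(s)]         # q_1, q_2

def entry(i, j):
    """q_ij = integral of p_i(s) q_j(s) ds."""
    return integrate(lambda s: p[i](s) * q[j](s), a, b)

Q = [[entry(i, j) for j in range(2)] for i in range(2)]

# System (2) reads lambda * c_j = sum_i q_ij c_i, so lambda is an eigenvalue
# of the transpose of Q, which has the same characteristic polynomial as Q:
# lambda^2 - trace * lambda + det = 0.
trace = Q[0][0] + Q[1][1]
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
disc = math.sqrt(trace * trace - 4.0 * det)
lambdas = sorted([(trace - disc) / 2.0, (trace + disc) / 2.0])
print(Q, lambdas)
```

Here the matrix comes out diagonal, so the characteristic functions are $\cos x$ and $\sin x$ themselves, with characteristic values close to $\pi/2$ and $-\pi/2$.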

If the kernel $K(x, s)$ of the Fredholm operator $A$ is
non-degenerate, the following method of approximation is frequently
useful in determining its characteristic functions and characteristic
values. We replace the given kernel $K(x, s)$ by a degenerate
kernel $K_n(x, s)$ which approximates it (for example, by a partial
sum of its Fourier development) and find the characteristic
functions and characteristic values of the corresponding operator $A_n$
by the method described above. Under certain assumptions regarding
the smoothness of the kernel $K(x, s)$ the characteristic values
of the operator $A_n$ obtained approximate those of the operator $A$.
We cannot deal at length with these questions here, but we
recommend the reader to consult the special literature†.


Problems. 1. What are the characteristic functions of Fredholm’s integral
operator with the kernel $K(x, s) = \cos (x + s)$ on the intervals (a) $[0, \pi]$,
(b) $[0, \pi/2]$?

Answer. (a) $\cos x$, $\sin x$; (b) $\cos x + \sin x$, $\cos x - \sin x$.

2. Show that a square-summable symmetric function $K(x, s)$ can be
developed in a bilinear series which converges in the metric of $L_2(G)$:

$$K(x, s) = \sum_{k=1}^{\infty} \lambda_k\, \varphi_k(x)\, \varphi_k(s), \qquad (1)$$

where the $\varphi_k(x)$ are the normalised characteristic functions and the $\lambda_k$ the
corresponding characteristic values of the integral operator $K$ (with kernel
$K(x, s)$).

Hint. The products $\varphi_j(x)\, \varphi_k(s)$ form a complete orthogonal system in
$L_2(G)$.

3. If with the conditions of problem 2 the function $K(x, s)$ is developed
in a series which converges in the metric of $L_2(G)$:

$$K(x, s) = \sum_{k=1}^{\infty} \mu_k\, u_k(x)\, u_k(s),$$

and the functions $u_k(s)$ are mutually orthogonal (in $L_2(a, b)$) and normalised,
then $u_k(x)$ is a characteristic function of the operator $K$ and $\mu_k$ is the
corresponding characteristic value.


† Cf. R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. 1,
Chapter 3, Section 9; L. V. Kantorovich and V. I. Krilov, Methods of
Approximation in Higher Analysis, State Technical Publishing Dept. 1949,
Chapter 2, Section 4.


4. Analogous to the quadratic form $\sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk}\, \xi_j \xi_k$ of finite rank is the
integral quadratic form

$$(K\varphi, \varphi) = \int_a^b\!\!\int_a^b K(x, s)\, \varphi(x)\, \varphi(s)\,dx\,ds. \qquad (2)$$

The quadratic form (2) is said to be positive-definite if $(K\varphi, \varphi) > 0$ for any
function $\varphi \in L_2(a, b)$ not identically zero. Show that all the characteristic
values of the Fredholm operator $K$ corresponding to a given positive-definite
quadratic form are positive.

5. If the kernel $K(x, s)$ of a positive-definite quadratic form (2) is
symmetric and continuous, then $K(s, s) \ge 0$.

Hint. Suppose that $K(s_0, s_0) < 0$ and construct a function $\varphi_0(x)$ for which
$(K\varphi_0, \varphi_0)$ is $< 0$.

6. If the kernel $K(x, s)$ corresponding to a positive-definite form $(K\varphi, \varphi)$
is continuous and symmetric, the development (1) converges absolutely and
uniformly (Mercer’s theorem).

Hint. Applying the result of problem 5 to the kernel

$$K(x, s) - \sum_{k=1}^{n} \lambda_k\, \varphi_k(x)\, \varphi_k(s)$$

we get the convergence of the series $\sum_{k=1}^{\infty} \lambda_k \varphi_k^2(s)$; use Cauchy’s inequality
to deduce that the series $\sum_{k=1}^{\infty} \lambda_k\, \varphi_k(x)\, \varphi_k(s)$ converges uniformly in each
coordinate (with the other held fixed). Then use this result in conjunction
with that of problem 3 to deduce that the sum of this series is $K(x, s)$.
Applying Dini’s theorem (Chapter II, Section 7, problem 4) to
$K(s, s) = \sum_{k=1}^{\infty} \lambda_k \varphi_k^2(s)$ and again using Cauchy’s inequality, obtain the uniform
convergence of the development (1) in the region $G$.


7. Show that the operator $A$, given in the orthonormal basis $\{e_n\}$ by the
matrix $\|a_{jk}\|$ in accordance with

$$A e_j = \sum_{k=1}^{\infty} a_{jk}\, e_k,$$

is completely continuous if $\sum_j \sum_k a_{jk}^2 < \infty$.

Hint. Express $A$ as the limit of operators mapping the entire space on to
finite-dimensional subspaces.

8. If $A$ is an integral operator in $L_2(a, b)$ with a square-summable kernel
and $\{e_n(x)\}$ is an orthonormal system in $L_2(a, b)$, then

$$\sum_{n} \|A e_n\|^2 = \sum_{j} \sum_{k} (A e_j, e_k)^2 < \infty.$$

Hint. Use the method of art. 2.


9. If an operator $A$ is given in $L_2(a, b)$ in the orthonormal basis $\{e_n(x)\}$
by the matrix $\|a_{jk}\|$ with $\sum_j \sum_k a_{jk}^2 < \infty$, a square-summable kernel $K(x, s)$
exists such that $A\varphi = \int_a^b K(x, s)\, \varphi(s)\,ds$.

Hint. Put $K(x, s) = \sum_j \sum_k (A e_j, e_k)\, e_k(x)\, e_j(s)$.

Note. The results of problems 8 and 9 show that, among the completely
continuous operators that act in the space $L_2(a, b)$, the Fredholm integral
operators are distinguished by the condition that the sum of the squares
of all the matrix elements in any orthogonal basis of the space is finite.


5. THE STURM-LIOUVILLE PROBLEM


1. The general theorem on integral operators with symmetric 
kernels has many applications in mathematical physics. One of 
the most important of these is the solution of the Sturm—Liouville 
problem. 

Consider the differential operator 


$$S[u] = (p(x)\, u'(x))' - q(x)\, u(x) \qquad (1)$$

$$(p(x) \in D_1(a, b),\quad q(x) \in C(a, b)),$$

defined on the closed interval $[a, b]$ for twice differentiable
functions $u(x)$ subject to certain homogeneous boundary conditions,
for example $u(a) = u(b) = 0$.

The operator $S$ is essentially different from the operators we
have so far considered; it is not a bounded operator, nor is it
defined on the whole space $L_2(a, b)$. Nevertheless, it is symmetric
on its domain of definition, i.e. for any two twice differentiable
functions $u$, $v$ which satisfy the prescribed boundary conditions,
the equation

$$(S u, v) = (u, S v) \qquad (2)$$

holds. For

$$(S u, v) = \int_a^b [(p u')' - q u]\, v\,dx = p u' v \Big|_a^b - \int_a^b (p u' v' + q u v)\,dx,$$

$$(u, S v) = \int_a^b u\, [(p v')' - q v]\,dx = u p v' \Big|_a^b - \int_a^b (p u' v' + q u v)\,dx,$$

and in virtue of the boundary conditions,

$$p\,(u' v - u v') \Big|_a^b = 0,$$

and consequently (2) holds.

A function $e(x)$ is said to be a characteristic function of the
operator $S$ if it is contained in the domain of $S$ (i.e. if it is twice
differentiable and satisfies the prescribed boundary conditions)
and satisfies the equation

$$S e = \lambda e.$$

By the symmetry of $S$, just as in Section 3, its characteristic
functions corresponding to distinct values of $\lambda$ are mutually
orthogonal in the space $L_2(a, b)$.

We wish to prove that the characteristic functions of the operator
$S$ form a complete system in $L_2(a, b)$. This problem, which
incorporates first the question of the existence of an infinite set of
characteristic functions, and secondly the question of the completeness
of such a system, is called the Sturm-Liouville problem.

2. There is an important problem in which the necessity of
solving the Sturm-Liouville problem becomes apparent.

Let us consider the vibrations of a non-uniform string fixed at
the points $x = a$, $x = b$. As we know from Chapter III, these
vibrations are described by the equation

$$(p\, u_x)_x = \mu\, u_{tt}, \qquad (1)$$

where $p(x)$ is the modulus of elasticity and $\mu(x)$ is the density of
the string; the required solution $u(x, t)$ must satisfy the boundary
conditions

$$u(a, t) = u(b, t) = 0 \qquad (2)$$

and the initial conditions

$$u(x, 0) = \varphi(x), \qquad u_t(x, 0) = \psi(x). \qquad (3)$$

We shall look for a solution of (1) subject to the condition (2) in
the form

$$u(x, t) = X(x)\, T(t). \qquad (4)$$

Substituting (4) in (1) and separating the variables, we find:

$$\frac{(p X')'}{\mu X} = \frac{T''}{T}, \qquad (5)$$


where primes denote differentiation with respect to the corresponding
variable. The right-hand side of (5) is independent of $x$ and
the left-hand side is independent of $t$, which means that the ratio (5) is constant.
Denoting its value by $\lambda$, we get the two equations

$$(p X')' = \lambda \mu X, \qquad (6)$$

$$T'' = \lambda T.$$

If $\mu(x) \equiv 1$, the function $X(x)$ must be a characteristic function
of the Sturm-Liouville equation (6).

If $\mu(x) \not\equiv 1$, we can make the substitution

$$z = X \sqrt{\mu},$$

when equation (6) will reduce to the form

$$(p_1 z')' - q z = \lambda z, \qquad (7)$$

where

$$p_1 = \frac{p}{\mu}, \qquad q = -\frac{1}{\sqrt{\mu}} \left( p \left( \frac{1}{\sqrt{\mu}} \right)' \right)',$$

and the function $z(x)$ again turns out to be a characteristic function
of the Sturm-Liouville equation (7). For the sake of simplicity
we shall assume in what follows that $\mu(x) \equiv 1$.

Let us suppose that the corresponding Sturm-Liouville problem
has an affirmative solution: there exists a complete orthogonal system
of functions $e_1(x), \dots, e_n(x), \dots$ satisfying the equations

$$(p(x)\, e_n'(x))' = \lambda_n e_n(x)$$

and the boundary conditions $e_n(a) = e_n(b) = 0$. If $p(x) > 0$, the
$\lambda_n$ are clearly negative, since

$$\lambda_n \int_a^b e_n^2(x)\,dx = \int_a^b (p e_n')'\, e_n\,dx = p e_n' e_n \Big|_a^b - \int_a^b p\, [e_n'(x)]^2\,dx < 0.$$

The equation

$$T'' = \lambda_n T$$

evidently has a solution of the form

$$T_n = A_n \cos \nu_n t + B_n \sin \nu_n t,$$

where $\nu_n^2 = -\lambda_n$ and $A_n$, $B_n$ are arbitrary constants. Equation (1)
has a set of solutions of the form

$$u_n(x, t) = e_n(x)\,(A_n \cos \nu_n t + B_n \sin \nu_n t),$$


which represent pure vibrations of frequencies $\nu_1, \nu_2, \dots$. The
numbers $\nu_1, \nu_2, \dots$ are said to be natural frequencies of the string
given by conditions (1)-(3). A solution $u(x, t)$ which will also
satisfy the initial conditions (3) can now be obtained by combining
the solutions $u_n(x, t)$ already found:

$$u(x, t) = \sum_{n=1}^{\infty} u_n(x, t).$$

To determine the coefficients $A_n$, $B_n$ we have the conditions

$$u(x, 0) = \varphi(x) = \sum_{n=1}^{\infty} A_n e_n(x),$$

$$u_t(x, 0) = \psi(x) = \sum_{n=1}^{\infty} \nu_n B_n e_n(x).$$

The functions $\varphi(x)$, $\psi(x)$ have developments in the $e_n(x)$ since the
$e_n(x)$ are assumed to form a complete system, and the coefficients
$A_n$, $B_n$ can therefore be found. They are evidently determined
directly by the coefficients of the developments of $\varphi$, $\psi$. Thus in
principle the problem of the vibrations of a non-uniform string
has been solved†.
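
The scheme can be sketched for the simplest case. The following is an illustration under assumptions that are not in the original text: a uniform string with $p = \mu = 1$ on $[0, \pi]$, for which $e_n(x) = \sqrt{2/\pi}\,\sin nx$, $\lambda_n = -n^2$ and $\nu_n = n$, together with an assumed initial displacement $x(\pi - x)$ and zero initial velocity. The series is truncated at 50 terms and all names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

L = math.pi

def e(n, x):
    """Normalised characteristic function of X'' = lambda X, X(0) = X(pi) = 0."""
    return math.sqrt(2.0 / L) * math.sin(n * x)

def phi(x):
    """Assumed initial displacement."""
    return x * (L - x)

def psi(x):
    """Assumed initial velocity."""
    return 0.0

N = 50
# A_n = (phi, e_n); nu_n B_n = (psi, e_n) with nu_n = n for this string.
A = [integrate(lambda x, n=n: phi(x) * e(n, x), 0.0, L) for n in range(1, N + 1)]
B = [integrate(lambda x, n=n: psi(x) * e(n, x), 0.0, L) / n for n in range(1, N + 1)]

def u(x, t):
    """Truncated series solution: sum of e_n(x) (A_n cos(n t) + B_n sin(n t))."""
    return sum(e(n, x) * (A[n - 1] * math.cos(n * t) + B[n - 1] * math.sin(n * t))
               for n in range(1, N + 1))

print(u(1.0, 0.0), phi(1.0))
```

At $t = 0$ the truncated series reproduces the initial displacement to within the truncation error, and it vanishes at the fixed end-points for every $t$.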

3. To tackle the solution of the general Sturm-Liouville problem,
let us suppose that the equation

$$S u = (p u')' - q u = 0$$

has no non-zero solution in the domain of the operator $S$ (i.e. among the
twice differentiable functions with the fixed boundary conditions).
The operator $S$ is then said to be non-singular.

We shall show that a non-singular operator $S$ has an inverse $A$,
which is a Fredholm integral operator with a continuous symmetric
kernel.

The phrase “$A$ is the inverse operator of the operator $S$” means
the following:

(1) For any function $u(x)$ in the domain of $S$ we have

$$A(S u) = u.$$

(2) For any continuous function $\psi(x)$ the function $A\psi(x)$ belongs
to the domain of $S$ and

$$S(A\psi) = \psi.$$


† We leave aside questions as to the convergence of the series obtained
and the sense in which the function $u(x, t)$ is a solution of equation (1).


The existence of an operator $A$ with the specified properties
solves the problem. For as a symmetric Fredholm operator, the
operator $A$ possesses an orthogonal system of characteristic
functions $e_1(x), \dots, e_n(x), \dots$ with non-zero characteristic values
$\lambda_1, \lambda_2, \dots, \lambda_n, \dots$ The functions $e_n(x)$ are all continuous in virtue
of the concluding results of Section 4. Operating with $S$ on the
equation

$$e_n = \frac{1}{\lambda_n} A e_n$$

we get

$$S e_n = \frac{1}{\lambda_n} S A e_n = \frac{1}{\lambda_n} e_n,$$

so that $e_n(x)$ is a characteristic function of $S$ with the characteristic
value $1/\lambda_n$. We shall show that the functions $e_n(x)$ ($n = 1, 2, \dots$)
form a complete system in the space $L_2(a, b)$ (and that consequently
the operator $A$ has no characteristic functions corresponding
to the characteristic value zero). For any function $u$ in the
domain of $S$, we have

$$A(S u) = u,$$

which shows that $u$ belongs to the range of the operator $A$. But
then $u$ can be developed in the characteristic functions of $A$ with
non-zero characteristic values, i.e. in the functions $e_1(x), e_2(x), \dots,
e_n(x), \dots$

The functions $u(x)$ evidently constitute a set which is dense
in the space $L_2(a, b)$. It follows that the functions $e_n(x)$ form a
complete system in $L_2(a, b)$, as required.

4. Thus the solution of the Sturm-Liouville problem reduces to
the construction of an inverse operator $A$ for the operator $S$ in
the form of a Fredholm integral operator with a symmetric kernel.

Let $u(x)$ be an arbitrary function in the domain of $S$, i.e. a
function which satisfies the boundary conditions and has two
continuous derivatives, and let $S u = \psi$. We have to reconstruct
$u$ from $\psi$. We observe that the domain of the operator $S$ can contain
only one function $u$ which satisfies the equation $S u = \psi$. For if
there were two possible solutions to such an equation, their
difference $v$ would satisfy the homogeneous equation $S v = 0$ and would
also lie in the domain of $S$, when it would follow by hypothesis
that $v = 0$.


So we must solve the equation

$$S u = \psi, \qquad (1)$$

where $\psi(x)$ is a given continuous function. We apply the usual
method of undetermined coefficients. Let $u_1(x)$, $u_2(x)$ be two
linearly independent solutions of the equation

$$S u = (p u')' - q u = 0, \qquad (2)$$

and for definiteness let $u_1(x)$ vanish at $x = b$, and $u_2(x)$ at $x = a$.
We shall seek a solution of equation (1) in the form

$$u(x) = C_1(x)\, u_1(x) + C_2(x)\, u_2(x), \qquad (3)$$

where $C_1(x)$, $C_2(x)$ are some undetermined (once) differentiable
functions. Differentiating (3), we find

$$u'(x) = C_1' u_1 + C_2' u_2 + C_1 u_1' + C_2 u_2'.$$

As usual we impose on the functions $C_1(x)$, $C_2(x)$ the condition

$$C_1' u_1 + C_2' u_2 = 0, \qquad (4)$$

so that

$$u' = C_1 u_1' + C_2 u_2'. \qquad (5)$$

Differentiating again and substituting in (1), we get

$$C_1'\, p\, u_1' + C_2'\, p\, u_2' = \psi. \qquad (6)$$

$C_1'$ and $C_2'$ can now be found from equations (4) and (6). Their
discriminant

$$p\,(u_1 u_2' - u_2 u_1')$$

is actually independent of $x$. In fact, by a well-known theorem of
Liouville the Wronskian $W(x) = u_1 u_2' - u_2 u_1'$ of the equation $S u = 0$
can be expressed in the form

$$W(x) = W(a)\, e^{-\int_a^x \frac{p'(\xi)}{p(\xi)}\,d\xi} = W(a)\, e^{-\ln \frac{p(x)}{p(a)}} = W(a)\, \frac{p(a)}{p(x)},$$

so that

$$p\,(u_1 u_2' - u_2 u_1') = p(x)\, W(x) = W(a)\, p(a) = c_0 = \text{const},$$

as required.

Solving the system (4), (6), we get

$$C_1' = -\frac{u_2 \psi}{c_0}, \qquad C_2' = \frac{u_1 \psi}{c_0}.$$


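We then take the primitives $C_1(x)$, $C_2(x)$ in the following form:

$$C_1(x) = -\frac{1}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi, \qquad
C_2(x) = -\frac{1}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi;$$

this choice of the primitives entails that the required solution

$$u = C_1 u_1 + C_2 u_2 = -\frac{u_1(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_2(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi$$

satisfies both boundary conditions $u(a) = u(b) = 0$.

Thus we have obtained the following result: if we have $S u = \psi$
for some function $u(x)$ in the domain of the operator $S$, then

$$u(x) = -\frac{u_1(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_2(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi,$$

or, in other words,

$$u(x) = \int_a^b K(x, \xi)\, \psi(\xi)\,d\xi,$$

where

$$K(x, \xi) = \begin{cases} -\dfrac{u_1(x)\, u_2(\xi)}{c_0} & \text{for } \xi \le x, \\[2mm] -\dfrac{u_1(\xi)\, u_2(x)}{c_0} & \text{for } \xi \ge x. \end{cases} \qquad (8)$$

The function $K(x, \xi)$ is clearly symmetric and continuous for
$a \le x \le b$, $a \le \xi \le b$. We denote

$$A\psi = \int_a^b K(x, \xi)\, \psi(\xi)\,d\xi. \qquad (9)$$

Then for any function $u(x)$ in the domain of $S$ we have $A(S u) = u$,
so that the first of the two conditions implicit in “$A$ is the
inverse operator of $S$” is satisfied. We verify the second condition.
Let $\psi(x)$ be an arbitrary continuous function on the closed interval
$[a, b]$; we shall show that $u = A\psi$ belongs to the domain of $S$ and
that the equation

$$S A\psi = \psi \qquad (10)$$

is satisfied.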


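On substituting the expression for $K(x, \xi)$ from (8) into (9),
we get

$$u(x) = -\frac{u_1(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_2(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi. \qquad (11)$$

This function is obviously continuous and vanishes at $x = a$,
$x = b$; in addition, it has a continuous derivative:

$$u'(x) = -\frac{u_1'(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_1(x)}{c_0}\, u_2(x)\, \psi(x)
- \frac{u_2'(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi + \frac{u_2(x)}{c_0}\, u_1(x)\, \psi(x)$$

$$= -\frac{u_1'(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_2'(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi. \qquad (12)$$

It is clear from (12) that $u(x)$ also has a second derivative:

$$u''(x) = -\frac{u_1''(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_1'(x)}{c_0}\, u_2(x)\, \psi(x)
- \frac{u_2''(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi + \frac{u_2'(x)}{c_0}\, u_1(x)\, \psi(x)$$

$$= -\frac{u_1''(x)}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi - \frac{u_2''(x)}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi + \frac{W(x)}{c_0}\, \psi(x). \qquad (13)$$

On multiplying (13) by $p(x)$, (12) by $p'(x)$, (11) by $-q(x)$ and
adding, we get

$$S u = -\frac{p u_1'' + p' u_1' - q u_1}{c_0} \int_a^x u_2(\xi)\, \psi(\xi)\,d\xi
- \frac{p u_2'' + p' u_2' - q u_2}{c_0} \int_x^b u_1(\xi)\, \psi(\xi)\,d\xi + \frac{p W}{c_0}\, \psi(x) = \psi(x),$$

since $S u_1 = S u_2 = 0$ and $p W = c_0$, or

$$S u = S A\psi = \psi,$$

as required.

We now have a full solution of the Sturm-Liouville problem in
the non-singular case.

The function $K(x, \xi)$ which we have constructed is called
Green’s function for the boundary problem under consideration.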
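
The construction can be tried on the simplest operator. The sketch below is an illustration under stated assumptions rather than part of the original text: it takes $S u = u''$ on $[0, \pi]$ with $u(0) = u(\pi) = 0$ (so $p = 1$, $q = 0$), chooses $u_1 = \pi - x$ and $u_2 = x$, for which $c_0 = p\,(u_1 u_2' - u_2 u_1') = \pi$, and assumes the kernel has the form $K(x, \xi) = -u_1(x)\, u_2(\xi)/c_0$ for $\xi \le x$ and $-u_1(\xi)\, u_2(x)/c_0$ for $\xi \ge x$. For the right-hand side $\sin x$ the exact solution of $u'' = \sin x$ with these boundary conditions is $-\sin x$.

```python
import math

def u1(x):
    """Solution of u'' = 0 vanishing at the right end-point b = pi."""
    return math.pi - x

def u2(x):
    """Solution of u'' = 0 vanishing at the left end-point a = 0."""
    return x

c0 = math.pi  # p (u1 u2' - u2 u1') with p = 1

def K(x, xi):
    """Green's function assembled from u1, u2 (assumed form, see lead-in)."""
    if xi <= x:
        return -u1(x) * u2(xi) / c0
    return -u1(xi) * u2(x) / c0

def solve(psi, x, n=2000):
    """Midpoint-rule approximation of u(x) = integral of K(x, xi) psi(xi) d xi."""
    h = math.pi / n
    return h * sum(K(x, (i + 0.5) * h) * psi((i + 0.5) * h) for i in range(n))

print(solve(math.sin, 1.0), -math.sin(1.0))
```

The two printed values agree to quadrature accuracy, and `K(x, xi)` equals `K(xi, x)`, in line with the symmetry of Green’s function.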

Note. A definite physical significance can be attached to Green’s
function. We interpret the equation

$$(p u')' - q u = \psi(x)$$

as the condition for equilibrium of a non-uniform string under the
action of a steady force of line-density $\psi(x)$. As we have shown,
the required equilibrium configuration can be described in the form
of an integral

$$u(x) = \int_a^b K(x, \xi)\, \psi(\xi)\,d\xi.$$

Now let $\psi(x)$ vanish everywhere except on an interval of length
$2h$, centre $\xi_0$, where it takes on the constant value $1/2h$. Such a
function $\psi(x)$ is the line-density of a unit force distributed uniformly
along the interval $[\xi_0 - h, \xi_0 + h]$. The equilibrium configuration
of the string is given in this case by the function

$$\frac{1}{2h} \int_{\xi_0 - h}^{\xi_0 + h} K(x, \xi)\,d\xi.$$

In the limit as $h \to 0$ this expression tends to $K(x, \xi_0)$. We can
therefore say that the function $K(x, \xi_0)$ describes the equilibrium
configuration of a string to which is applied a unit force concentrated
at the single point $\xi_0$.

Conversely, if we know for any $\xi$ the equilibrium configuration
$K(x, \xi)$ of the string under the action of a unit force concentrated
at the point $\xi$, we naturally assume that the corresponding
configuration under the action of a continuously distributed force of
line-density $\psi(x)$ is given by the integral

$$\int_a^b K(x, \xi)\, \psi(\xi)\,d\xi.$$

This consideration could equally well be taken as a
starting-point for the solution of the Sturm-Liouville problem†.

A similar situation often obtains with other physical models.
For instance let $K(x, \xi)$ denote the steady-state temperature of
a diathermal bar occupying the closed interval $[a, b]$ under the
agency of a heat source of unit power situated at the point $\xi$;
then the steady-state temperature when the source is continuously
distributed with line-density $\psi(\xi)$ will be given by the integral

$$u(x) = \int_a^b K(x, \xi)\, \psi(\xi)\,d\xi. \qquad (9)$$

It is for this reason that functions of the form (9) are called
“source-representable”.

5. Let us consider the special case when there exists a function
$u(x) \not\equiv 0$ which satisfies the boundary conditions and the equation
$S u = 0$.

This means that the operator $S$ possesses a characteristic value
$\lambda = 0$. Since $S$ can have only a countable set of characteristic
values (we recall that the characteristic functions of a symmetric
operator $S$ which correspond to distinct characteristic values are
orthogonal), there exists a number $\lambda_0$ which is not a characteristic
value of $S$. Then all our arguments can be applied to the operator
$S_1 = S - \lambda_0 E$, which has the same structure as $S$ but is
non-singular. The operator $S_1$ has an inverse operator $A_1$ (an integral
operator with a symmetric kernel) and therefore possesses a
complete system of characteristic functions $e_1(x), \dots, e_n(x), \dots$ with
characteristic values $\lambda_1, \dots, \lambda_n, \dots$; it follows that $S$ also has a
complete system of characteristic functions, viz. the same ones
$e_1(x), \dots, e_n(x), \dots$ with characteristic values $\lambda_1 + \lambda_0, \dots,
\lambda_n + \lambda_0, \dots$ Thus the Sturm-Liouville problem is also soluble in
the singular case.


Problem. Construct Green’s function $K(x, s)$ for the differential operator

$$S u = u''$$

with the boundary conditions: (1) $u(0) = u(\pi) = 0$; (2) $u(0) = u'(\pi) = 0$.
Write down the developments of these functions in bilinear series (Section 4,
problem 2).


† Cf. R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. 1,
Chapter 5, Section 14.


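Answer.

$$K_1(x, \xi) = \begin{cases} -\dfrac{x\,(\pi - \xi)}{\pi} & \text{for } x < \xi, \\[2mm] -\dfrac{(\pi - x)\,\xi}{\pi} & \text{for } x > \xi, \end{cases}
\qquad K_2(x, s) = -\min (x, s),$$

$$K_1(x, s) = -\frac{2}{\pi} \sum_{n=1}^{\infty} \frac{\sin nx\, \sin ns}{n^2},$$

$$K_2(x, s) = -\frac{2}{\pi} \sum_{n=0}^{\infty} \frac{\sin \left(n + \frac{1}{2}\right) x\, \sin \left(n + \frac{1}{2}\right) s}{\left(n + \frac{1}{2}\right)^2}.$$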

6. NON-HOMOGENEOUS INTEGRAL EQUATIONS WITH
SYMMETRIC KERNELS


1. In this section we shall consider the equation

$$\varphi(x) = f(x) + \int_a^b K(x, s)\, \varphi(s)\,ds, \qquad (1)$$

where $f(x)$ and $K(x, s)$ are given, and $\varphi(x)$ is the function required.
The kernel $K(x, s)$ is assumed to be symmetric and square-summable.

In abstract space the analogous equation is of the form

$$\varphi = f + A\varphi, \qquad (2)$$

where $A$ is a completely continuous symmetric operator.

Let us suppose to begin with that a solution $\varphi$ of equation (2)
exists.

Projecting both sides of (2) onto the line determined by the
characteristic vector $e_k$ (where $A e_k = \lambda_k e_k$), we get

$$(\varphi, e_k) = (f, e_k) + (A\varphi, e_k) = (f, e_k) + (\varphi, A e_k)
= (f, e_k) + (\varphi, \lambda_k e_k) = (f, e_k) + \lambda_k (\varphi, e_k); \qquad (3)$$

and hence for $\lambda_k \neq 1$ we find

$$(\varphi, e_k) = \frac{(f, e_k)}{1 - \lambda_k}. \qquad (4)$$

Thus, provided the number 1 is not a characteristic value of the
operator, all the Fourier coefficients of the solution $\varphi$ are
well-defined. In this case there can only be one solution, namely

$$\varphi = \sum_{k=1}^{\infty} \frac{(f, e_k)}{1 - \lambda_k}\, e_k. \qquad (5)$$


Let us see that the series (5) does actually constitute a solution
of (2). We observe first that the series is convergent in the norm
since the squares of its coefficients evidently form a convergent
series†. We denote its sum by $\varphi$. Then

$$A\varphi = \sum_{k=1}^{\infty} \frac{\lambda_k (f, e_k)}{1 - \lambda_k}\, e_k
= \sum_{k=1}^{\infty} \frac{(f, e_k)}{1 - \lambda_k}\, e_k - \sum_{k=1}^{\infty} (f, e_k)\, e_k = \varphi - f,$$

so that the vector $\varphi$ really does satisfy equation (2).

Let us consider now the case when the set of characteristic
values of the operator $A$ contains the number 1. If $\lambda_k = 1$ and
$(f, e_k) \neq 0$, equation (3) yields a contradiction and (2) has no
solution. But if when $\lambda_k = 1$ we also have $(f, e_k) = 0$, then there is
no contradiction, but equation (3) imposes no condition at all on
the unknown coefficient $(\varphi, e_k)$.

We claim that in this case the series (5) gives a solution of (2),
in which for $\lambda_k = 1$ we can take any value for $(f, e_k)/(1 - \lambda_k)$. For
let us put $\varphi = \varphi' + \varphi''$, where

$$\varphi' = \sum \frac{(f, e_k)}{1 - \lambda_k}\, e_k \quad (\lambda_k \neq 1),
\qquad \varphi'' = \sum \xi_k e_k \quad (\lambda_k = 1).$$

By hypothesis, the vector $f$ is orthogonal to the vectors $e_k$ for
which $\lambda_k = 1$ and therefore has a development in the vectors $e_k$
of the first group. Applying the theorem proved above to the
subspace generated by these vectors, we get

$$A\varphi' = \varphi' - f.$$

Using the condition $\lambda_k = 1$, we find further:

$$A\varphi'' = \sum \xi_k A e_k = \sum \xi_k e_k = \varphi''.$$

It follows that

$$A\varphi = A\varphi' + A\varphi'' = \varphi' - f + \varphi'' = \varphi - f,$$

as required.
We have obtained the following result:

THEOREM. If the number 1 is not included amongst the $\lambda_k$, then
the equation

$$\varphi = f + A\varphi \qquad (6)$$

has a unique solution for any $f$. If the number 1 does occur in the $\lambda_k$,
then a solution of (6) exists only for vectors $f$ which are orthogonal to
the corresponding characteristic subspace of the operator $A$; moreover
the solution is determined only up to an arbitrary term of this
subspace.

† The quantities $\dfrac{1}{|1 - \lambda_k|}$ have a common bound since $\lambda_k \to 0$ (Section 3).
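
A small numerical sketch may make the first case of the theorem concrete. It is an illustration under assumptions, not the author’s worked example: the degenerate kernel $3xs$ on $[0, 2]$ and the free term $f(x) = 3x - 2$ are chosen here for convenience. This kernel has the single non-zero characteristic value $\lambda = 3\|x\|^2 = 8$ with normalised characteristic function $e = x/\|x\|$, so by (4) the solution is $\varphi = f + A\varphi = f + \dfrac{\lambda\,(f, e)}{1 - \lambda}\, e$. The helper names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 2.0

def f(x):
    """Assumed free term."""
    return 3.0 * x - 2.0

norm_sq = integrate(lambda s: s * s, a, b)          # ||x||^2
lam = 3.0 * norm_sq                                  # A x = lam * x for the kernel 3 x s
f_dot_e = integrate(lambda s: f(s) * s, a, b) / math.sqrt(norm_sq)   # (f, e)

coeff = lam * f_dot_e / (1.0 - lam)                  # coefficient of e in A phi

def phi(x):
    """Solution phi = f + coeff * e with e(x) = x / ||x||."""
    return f(x) + coeff * x / math.sqrt(norm_sq)

def residual(x):
    """phi(x) - f(x) - integral of 3 x s phi(s) ds; should be close to zero."""
    return phi(x) - f(x) - 3.0 * x * integrate(lambda s: s * phi(s), a, b)

print(phi(1.0), residual(1.0))
```

With these choices the computed solution is close to $\tfrac{9}{7}x - 2$. If the interval is changed to $[0, 1]$, then $\lambda = 1$ and the denominator vanishes, which is the second case of the theorem.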


Problems. 1. Solve the equations:

(a) $\varphi_1(x) = 3 \int_0^2 x s\, \varphi_1(s)\,ds + 3x - 2.$

Answer. $\varphi_1(x) = \tfrac{9}{7}\, x - 2.$

(b) $\varphi_2(x) = 3 \int_0^1 x s\, \varphi_2(s)\,ds + 3x - 2.$

Answer. $\varphi_2(x) = Cx - 2$ ($C$ arbitrary).

(c) $\varphi_3(x) = \int_0^1 (x + s)\, \varphi_3(s)\,ds + 18x^2 - 9x - 4.$

Answer. $\varphi_3(x) = 18x^2 + 12x + 9.$

(d) $\varphi_4(x) = \int_0^{\pi} \cos (x + s)\, \varphi_4(s)\,ds + 1.$

Answer. $\varphi_4(x) = 1 - \dfrac{4 \sin x}{\pi + 2}.$

Hint. Use the method of Section 4, art. 4.

2. An integral equation in which the unknown function $\varphi(x)$ enters only
under the integral sign

$$\int_a^b K(x, s)\, \varphi(s)\,ds = f(x),$$

is sometimes termed an equation of the first kind (in distinction to equations
of the second kind such as were considered above where the unknown
function figures separately). Show that an equation of the first kind (in the case
of a square-summable symmetric kernel) has a solution in the space $L_2(a, b)$
if and only if the series $\sum_{n=1}^{\infty} \dfrac{c_n^2}{\lambda_n^2}$ converges, where the $c_n$ are the coefficients
of the development of $f(x)$ in the characteristic functions of the kernel
$K(x, s)$ and the $\lambda_n$ are the corresponding characteristic values.


2. By way of example, an equation of the form (1) arises in
solving the problem of forced vibrations of a non-uniform string
fixed at its end-points under the action of a periodic force
$g(x, t) = g(x)\cos\omega t$.
These vibrations are described by the equation
$$(p(x)\,u_x)_x = \mu(x)\,u_{tt} + g(x)\cos\omega t, \tag{1}$$
where $p(x)$, $\mu(x)$ are physical characteristics of the string and $g(x)$
is the external force per unit length of string.

We shall look for a particular solution of equation (1) in the
form of the product
$$u(x, t) = \varphi(x)\cos\omega t, \tag{2}$$
where $\varphi(x)$ is some twice differentiable function that vanishes for
$x = a$ and $x = b$. Substituting (2) in (1), we get for the function
$\varphi(x)$
$$(p\varphi')' + \omega^2\mu\varphi = g(x). \tag{3}$$
Operating with $A$ on both sides of the equation, $A$ being the inverse
operator of $S\varphi = (p\varphi')'$, we get
$$\varphi + \omega^2 A(\mu\varphi) = Ag.$$
Recalling that $A$ is Fredholm’s operator with a symmetric kernel
$K(x, s)$, we arrive at the equation
$$\varphi(x) = f(x) - \omega^2\int_a^b K(x, s)\,\mu(s)\,\varphi(s)\,ds, \tag{4}$$
where $f(x) = Ag(x) = \int_a^b K(x, s)\,g(s)\,ds$ is a known function. If
$\mu(s) \not\equiv 1$, the Fredholm operator has a non-symmetric kernel. In
this case, the substitution $\psi(x) = \varphi(x)\sqrt{\mu(x)}$ reduces the equation
to one with the symmetric kernel $K(x, s)\sqrt{\mu(x)}\sqrt{\mu(s)}$. For
simplicity we shall assume that $\mu(s) \equiv 1$.

The condition for equation (4) to have a solution for any $f(x)$
is that the kernel $-\omega^2 K(x, s)$ should not be associated with the
characteristic value 1, or what is the same thing, that the
Sturm-Liouville operator $S$, the inverse operator of $A$, should lack the
characteristic value $\lambda = -\omega^2$. We observe that the frequencies
of the characteristic vibrations of the string were determined by
the condition $\lambda_n = -\nu_n^2$ (Section 5, art. 2). Thus the condition for
equation (4) to have a solution for any function $f(x)$ is that the
frequency $\omega$ should not coincide with any natural frequency $\nu_n$
of the string. The external force must not be in resonance with
the natural frequencies of the string.

If the frequency $\omega$ of the forced vibration does coincide with
one of the natural frequencies of the string, the condition for the
problem to be soluble is that $f(x)$ should be orthogonal to the
corresponding characteristic function $e_n(x)$, i.e. $(f, e_n) = (Ag, e_n) = 0$.
Since the operator $A$ is symmetric, this condition quickly
reduces to one of orthogonality between the function $g(x)$ and the
characteristic function $e_n(x)$:
$$(Ag, e_n) = (g, Ae_n) = (g, \lambda_n e_n) = \lambda_n (g, e_n) = 0.$$
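
As a hedged illustration of the resonance condition, consider the simplest case, which is assumed here purely for the sake of example and is not taken from the text: a uniform string with $p \equiv 1$, $\mu \equiv 1$, $a = 0$, $b = \pi$. The characteristic functions are then $e_n(x) = \sqrt{2/\pi}\,\sin nx$, the natural frequencies are $\nu_n = n$, and the equation for $\varphi$ can be solved term by term (the coefficients $g_n$, $c_n$ are notation introduced only for this sketch):

```latex
\varphi'' + \omega^2 \varphi = g, \qquad \varphi(0) = \varphi(\pi) = 0,
\qquad g = \sum_{n=1}^{\infty} g_n e_n, \quad \varphi = \sum_{n=1}^{\infty} c_n e_n
\;\Longrightarrow\; (\omega^2 - n^2)\, c_n = g_n \quad (n = 1, 2, \dots).
```

If $\omega \ne n$ for every $n$, each coefficient $c_n = g_n/(\omega^2 - n^2)$ is determined uniquely. If $\omega = k$ for some $k$, the $k$-th relation reads $0 \cdot c_k = g_k$, so a solution exists only when $g_k = (g, e_k) = 0$, and $c_k$ then remains arbitrary.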

The solution obtained for the integral equation (4) allows us to
construct a particular solution of (3). Any solution of (3) can be
obtained by adding to the solution found some solution of the
homogeneous equation
$$(p\,u_x)_x = u_{tt}.$$
These considerations enable us also to obtain a solution of
equation (1) in the case when we have initial conditions.


7. NON-HOMOGENEOUS INTEGRAL EQUATIONS WITH
ARBITRARY KERNELS


1. Let us consider the integral equation
$$\varphi(x) - \int_a^b K(x, s)\,\varphi(s)\,ds = f(x), \tag{$1_1$}$$
where the kernel $K(x, s)$ is square-summable but not in general
symmetric. The function $f(x)$ is assumed to belong to the space
$L_2(a, b)$ and it is in this space that we seek the unknown function
$\varphi(x)$.

For $f(x) \equiv 0$ we get a homogeneous equation in which we denote
the unknown function by $\varphi_0(x)$:
$$\varphi_0(x) - \int_a^b K(x, s)\,\varphi_0(s)\,ds = 0. \tag{$1_2$}$$
It seems natural to consider alongside equations ($1_1$), ($1_2$) the
“allied” equations with kernel $K(s, x)$ distinguished from the
original kernel $K(x, s)$ by a transposition of the arguments:
$$\psi(x) - \int_a^b K(s, x)\,\psi(s)\,ds = g(x), \tag{$1_3$}$$
$$\psi_0(x) - \int_a^b K(s, x)\,\psi_0(s)\,ds = 0. \tag{$1_4$}$$


The following fundamental theorem establishes a connection
between the solutions of equations ($1_1$), ($1_2$), ($1_3$), ($1_4$).

We observe first that only two cases are logically possible:
(a) equation ($1_2$) has the unique solution $\varphi_0(x) \equiv 0$;
(b) equation ($1_2$) has a solution $\varphi_0(x) \not\equiv 0$.

THEOREM 1. (E. Fredholm, 1903). In case (a) equation ($1_1$) has
a solution for any $f(x) \in L_2$ which is, moreover, unique; equation ($1_4$)
has the unique solution $\psi_0(x) \equiv 0$; equation ($1_3$) has a unique
solution for any $g(x) \in L_2$.

In case (b) the number of linearly independent solutions of equation
($1_2$) is finite; we denote it by $\nu$. Equation ($1_4$) has the same number of
linearly independent solutions. Equation ($1_1$) has a solution if and
only if the function $f(x)$ is orthogonal to all the $\nu$ solutions of ($1_4$);
this solution is determined not uniquely, but up to a term which is a
solution of ($1_2$); among the solutions of equation ($1_1$) there exists one
and only one which is orthogonal to all the solutions of ($1_2$). Similar
assertions hold for the solutions of equation ($1_3$).

2. Let us consider first of all the analogue of theorem 1 for the
case of a linear system of algebraic equations
$$\sum_{j=1}^{m} a_{ij}\,\xi_j = b_i \quad (i = 1, 2, \dots, m). \tag{$2_1$}$$
We write down the corresponding homogeneous system
$$\sum_{j=1}^{m} a_{ij}\,\xi_j^0 = 0 \tag{$2_2$}$$
and the allied systems
$$\sum_{j=1}^{m} a_{ji}\,\eta_j = c_i, \tag{$2_3$}$$
$$\sum_{j=1}^{m} a_{ji}\,\eta_j^0 = 0, \tag{$2_4$}$$
for which the matrix is obtained by transposing the matrix of the
systems ($2_1$) and ($2_2$). We examine the assertions that parallel those
of Fredholm’s theorem.


(a) Let us suppose that the system ($2_2$) has only the null solution.
As we know from algebra, this means that the rank of the matrix
$A = \|a_{ij}\|$ is equal to the number $m$, i.e. $\det A \ne 0$. Hence system
($2_1$) has a solution for any $b_i$. The system ($2_3$) has the determinant
$\det\|a_{ji}\| = \det\|a_{ij}\|$ which is therefore also non-zero; hence
system ($2_3$) possesses a solution for any $c_i$, which is moreover
unique; in particular, for $c_i = 0$ it has the unique solution $\eta^0 = 0$.
Thus in case (a) all the assertions that constitute the analogue to
Fredholm’s theorem are verified.


(b) We suppose now that the system ($2_2$) has a non-trivial
solution $\xi^0$. This means that the rank $r$ of the matrix $A$ is less
than $m$. The number $\nu$ of linearly independent solutions of the
system ($2_2$) is equal to $m - r$. Since the rank of a matrix is invariant
under transposition, the number of linearly independent
solutions of the system ($2_4$) is also equal to $m - r = \nu$. The system
($2_1$) no longer has a solution for arbitrary $b_i$. To ascertain what
conditions must be imposed on the $b_i$ to ensure that the system ($2_1$)
should have a solution, we interpret the system geometrically,
regarding the aggregate of numbers $(\xi_1, \dots, \xi_m)$ as a vector in
the $m$-dimensional Euclidean space $E_m$. The existence of a solution
of the system ($2_1$) is equivalent to asserting that the vector
$b = (b_1, b_2, \dots, b_m)$ is contained in the linear envelope $L$ of the
vectors $a_1 = (a_{11}, a_{21}, \dots, a_{m1})$, $a_2 = (a_{12}, a_{22}, \dots, a_{m2})$, ...,
$a_m = (a_{1m}, a_{2m}, \dots, a_{mm})$. If $Z$ is the orthogonal complement of
this envelope, the interpretation can be expressed as follows:
the system ($2_1$) has a solution if and only if the vector $b$ is orthogonal
to the subspace $Z$. The condition that a vector $\eta^0$ should
belong to the subspace $Z$ is formulated in system ($2_4$). It follows
that system ($2_1$) has a solution if and only if the vector $b$ is orthogonal
to any solution of system ($2_4$). Further, in the case considered
the system ($2_1$) has a whole complex of solutions which in geometrical
terms constitute a hyperplane parallel to the solution subspace
of system ($2_2$). The perpendicular dropped from the origin of
coordinates onto this hyperplane identifies uniquely that solution
which is orthogonal to all the solutions of the system ($2_2$). Thus
we have also verified for case (b) the analogues of the assertions
of Fredholm’s theorem.
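
The algebraic alternative can be checked on a small example. The following is a minimal sketch in exact rational arithmetic, not part of the original text; the matrix `A`, the right-hand sides and all helper names (`rref`, `null_space`, `solvable`, `orthogonal_to_adjoint_solutions`) are illustrative assumptions. It compares direct solubility of the system (tested by ranks) with orthogonality of the right-hand side to the solutions of the transposed homogeneous system.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form in exact arithmetic; returns (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots


def null_space(rows):
    """Basis of the solutions of the homogeneous system with matrix `rows`."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for r, c in enumerate(pivots):
            v[c] = -m[r][free]
        basis.append(v)
    return basis


def transpose(rows):
    return [list(col) for col in zip(*rows)]


def rank(rows):
    return len(rref(rows)[1])


def solvable(A, b):
    """Direct test: the system A xi = b is soluble iff rank [A | b] = rank A."""
    return rank([row + [bi] for row, bi in zip(A, b)]) == rank(A)


def orthogonal_to_adjoint_solutions(A, b):
    """Criterion of the alternative: b is orthogonal to every solution of A^T eta = 0."""
    return all(sum(x * y for x, y in zip(b, eta)) == 0
               for eta in null_space(transpose(A)))


# Illustrative singular matrix (second row is twice the first), so m = 3, r = 2.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
nu = len(null_space(A))                  # solutions of the homogeneous system
nu_adj = len(null_space(transpose(A)))   # solutions of the allied homogeneous system
```

In this sketch the two tests agree for every right-hand side tried, and `nu` and `nu_adj` are both equal to $m - r$.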

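3. Returning now to the integral equations, let us begin by
considering the equations with degenerate kernels:
$$K(x, s) = \sum_{k=1}^{m} p_k(x)\,q_k(s),$$
$$K(s, x) = \sum_{k=1}^{m} p_k(s)\,q_k(x).$$
We can assume that the functions $p_k(x)$, and likewise the $q_k(s)$,
are linearly independent. Equations ($1_1$)-($1_4$) acquire the form
$$\varphi(x) - \sum_{k=1}^{m} p_k(x)\int_a^b q_k(s)\,\varphi(s)\,ds = f(x), \tag{$3_1$}$$
$$\varphi_0(x) - \sum_{k=1}^{m} p_k(x)\int_a^b q_k(s)\,\varphi_0(s)\,ds = 0, \tag{$3_2$}$$
$$\psi(x) - \sum_{k=1}^{m} q_k(x)\int_a^b p_k(s)\,\psi(s)\,ds = g(x), \tag{$3_3$}$$
$$\psi_0(x) - \sum_{k=1}^{m} q_k(x)\int_a^b p_k(s)\,\psi_0(s)\,ds = 0. \tag{$3_4$}$$
These equations can be expressed in the abstract form
$$\varphi - \sum_{k=1}^{m} p_k\,(q_k, \varphi) = f, \tag{$4_1$}$$
$$\varphi_0 - \sum_{k=1}^{m} p_k\,(q_k, \varphi_0) = 0, \tag{$4_2$}$$
$$\psi - \sum_{k=1}^{m} q_k\,(p_k, \psi) = g, \tag{$4_3$}$$
$$\psi_0 - \sum_{k=1}^{m} q_k\,(p_k, \psi_0) = 0, \tag{$4_4$}$$
where the vectors $\varphi$, $f$, $p_k$, $q_k$, ... belong to some Euclidean
space $E$.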

The operators involved in these equations belong to the class of
degenerate operators; an operator is said to be degenerate if it is
defined by an equation of the form
$$B\varphi = \sum_{k=1}^{m} p_k\,(q_k, \varphi).$$
It is evident that the degenerate operator $B$ maps the whole space
onto the finite-dimensional subspace generated by the vectors
$p_1, p_2, \dots, p_m$. It is clear from equation ($4_1$) that if it possesses a
solution it will be of the form
$$\varphi = f + \sum_{k=1}^{m} \xi_k\,p_k, \tag{$5_1$}$$
where the $\xi_k$ are certain unknown coefficients. Similarly the
solutions of the remaining equations will be of the form
$$\varphi_0 = \sum_{k=1}^{m} \xi_k^0\,p_k, \tag{$5_2$}$$
$$\psi = g + \sum_{k=1}^{m} \eta_k\,q_k, \tag{$5_3$}$$
$$\psi_0 = \sum_{k=1}^{m} \eta_k^0\,q_k. \tag{$5_4$}$$

Substituting ($5_1$) in ($4_1$) we find that the numbers $\xi_k$ must satisfy
the equation
$$\sum_{k=1}^{m} \xi_k\,p_k - \sum_{k=1}^{m} p_k\,(q_k, f) - \sum_{k=1}^{m} p_k\Big(q_k, \sum_{j=1}^{m} \xi_j\,p_j\Big) = 0$$
or (since the vectors $p_k$ are linearly independent), the system
$$\xi_k - \sum_{j=1}^{m} \xi_j\,(p_j, q_k) = (f, q_k) \quad (k = 1, 2, \dots, m).$$
With the notation $-(p_j, q_i) = a_{ij}$ $(i \ne j)$, $1 - (p_i, q_i) = a_{ii}$,
$(f, q_i) = b_i$, we reduce this system to the form
$$\sum_{j=1}^{m} a_{ij}\,\xi_j = b_i \quad (i = 1, 2, \dots, m). \tag{$6_1$}$$
Equations ($4_2$)-($4_4$) reduce similarly to systems of the form
$$\sum_{j=1}^{m} a_{ij}\,\xi_j^0 = 0, \tag{$6_2$}$$
$$\sum_{j=1}^{m} a_{ji}\,\eta_j = c_i, \tag{$6_3$}$$
$$\sum_{j=1}^{m} a_{ji}\,\eta_j^0 = 0, \tag{$6_4$}$$
respectively, where $c_i = (g, p_i)$. If a solution exists for any one
of the systems ($6_1$)-($6_4$), a solution of the corresponding member
of the set ($4_1$)-($4_4$), or what is the same thing, a solution of the
corresponding equation in ($3_1$)-($3_4$) can be constructed using the
corresponding formula in the group ($5_1$)-($5_4$).

But as we have seen, all the assertions paralleling the Fredholm
theorem hold for the systems ($6_1$)-($6_4$). They will therefore be
valid for equations ($3_1$)-($3_4$) also. We need only verify that the
scalar product, say of the vector $f$ and the solution $\psi_0$ of equation
($4_4$), coincides in the sense of the metric of the abstract Euclidean
space $E$ with the scalar product of the vector $b$ with coordinates
$b_k = (f, q_k)$ and the vector $\eta^0 = (\eta_1^0, \eta_2^0, \dots, \eta_m^0)$ of the
finite-dimensional Euclidean space $E_m$. This verification is accomplished
by the simple computation:
$$(f, \psi_0) = \Big(f, \sum_{k=1}^{m} \eta_k^0\,q_k\Big) = \sum_{k=1}^{m} \eta_k^0\,(f, q_k) = \sum_{k=1}^{m} b_k\,\eta_k^0 = (b, \eta^0).$$
We have thus established the Fredholm theorem in the case of a
degenerate kernel $K(x, s)$.
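
The reduction of a degenerate-kernel equation to a linear algebraic system can be carried out mechanically. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the kernel $K(x, s) = x + s$ on $[0, 1]$, i.e. $p_1 = x$, $q_1 = 1$, $p_2 = 1$, $q_2 = s$, with free term $f(x) = 18x^2 - 9x - 4$, represents polynomials by coefficient lists (lowest degree first), and assumes the resulting matrix is non-singular (case (a)). All function names are hypothetical.

```python
from fractions import Fraction


def poly_mul(u, v):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def inner(u, v, a, b):
    """Scalar product (u, v) = integral of u(x) v(x) over [a, b], computed exactly."""
    a, b = Fraction(a), Fraction(b)
    return sum(c * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(poly_mul(u, v)))


def solve(M, rhs):
    """Gauss-Jordan elimination; assumes det M != 0 (raises StopIteration otherwise)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [x - factor * y for x, y in zip(aug[i], aug[c])]
    return [row[n] for row in aug]


def solve_degenerate(p, q, f, a, b):
    """Solve phi - sum_k p_k (q_k, phi) = f via a_ij = delta_ij - (p_j, q_i), b_i = (f, q_i)."""
    m = len(p)
    M = [[(1 if i == j else 0) - inner(p[j], q[i], a, b) for j in range(m)]
         for i in range(m)]
    rhs = [inner(f, q[i], a, b) for i in range(m)]
    xi = solve(M, rhs)
    # phi = f + sum_k xi_k p_k, again as a coefficient list.
    phi = [Fraction(0)] * max(len(f), max(len(pk) for pk in p))
    for k, c in enumerate(f):
        phi[k] += c
    for coeff, pk in zip(xi, p):
        for k, c in enumerate(pk):
            phi[k] += coeff * c
    return xi, phi


p = [[0, 1], [1]]      # p_1(x) = x, p_2(x) = 1
q = [[1], [0, 1]]      # q_1(s) = 1, q_2(s) = s
f = [-4, -9, 18]       # f(x) = 18 x^2 - 9 x - 4
xi, phi = solve_degenerate(p, q, f, 0, 1)
```

For these data the coefficients come out as $\xi_1 = 21$, $\xi_2 = 13$, so that $\varphi(x) = 18x^2 + 12x + 9$; substituting back into the integral equation confirms the result.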


4. We now consider the general case. Let $K(x, s)$ be an arbitrary
function, square-summable over the region $a \le x, s \le b$. As was
shown in Section 4, the integral operator
$$A\varphi = \int_a^b K(x, s)\,\varphi(s)\,ds$$
can be represented as the limit (in the sense of the norm) of integral
operators
$$A_n\varphi = \int_a^b K_n(x, s)\,\varphi(s)\,ds$$
with degenerate kernels $K_n(x, s)$. It is evident that the adjoint
integral operator
$$A^*\psi = \int_a^b K(s, x)\,\psi(s)\,ds$$
is then the limit of the integral operators
$$A_n^*\psi = \int_a^b K_n(s, x)\,\psi(s)\,ds,$$
the kernels of which are also degenerate.

The adjoint operator $A^*$ is related to $A$ by the equation
$$(A^*p, q) = (p, Aq) \tag{7}$$
for any vectors $p$, $q$. To see this, we observe that
$$(A^*p, q) = \int_a^b \Big\{\int_a^b K(s, x)\,p(s)\,ds\Big\}\,q(x)\,dx,$$
$$(p, Aq) = \int_a^b p(x)\,\Big\{\int_a^b K(x, s)\,q(s)\,ds\Big\}\,dx.$$
The first of these integrals is transformed into the second by
interchanging the variables $x$, $s$ and reversing the order of integration,
a procedure which is valid in the general case in virtue of Fubini’s
theorem.
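
Relation (7) can be observed numerically. The following sketch is only an illustration: the non-symmetric kernel $K(x, s) = x s^2 + 1$ on $[0, 1]$, the test functions $p(x) = x^2$, $q(x) = 1 - x$, the midpoint quadrature and all function names are assumptions made for the example. On the grid, interchanging the arguments of the kernel corresponds to transposing a matrix, so the two discrete scalar products agree up to rounding.

```python
def midpoint_grid(a, b, n):
    """Midpoints of n equal subintervals of [a, b] and the step h."""
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h


def apply_kernel(kernel, values, nodes, h):
    """Midpoint-rule approximation of (A v)(x_i) = integral of kernel(x_i, s) v(s) ds."""
    return [sum(kernel(x, s) * v for s, v in zip(nodes, values)) * h for x in nodes]


def inner(u, v, h):
    """Discrete analogue of the scalar product (u, v) = integral of u(x) v(x) dx."""
    return sum(a * b for a, b in zip(u, v)) * h


def K(x, s):
    return x * s ** 2 + 1.0      # illustrative non-symmetric kernel


def K_adjoint(x, s):
    return K(s, x)               # kernel with the arguments transposed


nodes, h = midpoint_grid(0.0, 1.0, 200)
p = [x ** 2 for x in nodes]
q = [1.0 - x for x in nodes]

lhs = inner(apply_kernel(K_adjoint, p, nodes, h), q, h)   # (A* p, q)
rhs = inner(p, apply_kernel(K, q, nodes, h), h)           # (p, A q)
```

For these particular functions the exact common value is $\int_0^1 x^2\,(x/12 + 1/2)\,dx = 3/16$, which the quadrature reproduces to within its discretisation error.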

Equations ($1_1$)-($1_4$) can be written in abstract form, regarding
the vectors $\varphi$, $f$, ... as elements of some Euclidean space $E$:
$$\varphi - A\varphi = f, \tag{$8_1$}$$
$$\varphi_0 - A\varphi_0 = 0, \tag{$8_2$}$$
$$\psi - A^*\psi = g, \tag{$8_3$}$$
$$\psi_0 - A^*\psi_0 = 0. \tag{$8_4$}$$
The solutions of equations of the form ($8_2$) are the characteristic
vectors of the corresponding operators with characteristic value 1.
For brevity we shall simply call them the characteristic vectors.
We denote by $A_n$, $A_n^*$ the operators corresponding to the degenerate
kernels $K_n(x, s)$, $K_n(s, x)$, and consider the homogeneous
equation
$$\varphi_n^0 - A_n\varphi_n^0 = 0. \tag{$9_2$}$$

Lemma 1. If for each $n$ equation ($9_2$) has a solution $\varphi_n^0 \ne 0$,
equation ($8_2$) will also have a non-zero solution.

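Proof. We can always take the solution $\varphi_n^0$ of ($9_2$) to be normalised,
so that $\|\varphi_n^0\| = 1$. Since the operator $A$ is completely continuous,
the sequence $A\varphi_n^0$ contains a convergent subsequence; discarding
unwanted terms and renumbering, we can regard the sequence
$A\varphi_n^0$ itself as convergent. Then $A_n\varphi_n^0$ also converges, since
$$A_n\varphi_n^0 = A\varphi_n^0 + (A_n - A)\varphi_n^0$$
and
$$\|(A_n - A)\varphi_n^0\| \le \|A_n - A\|\,\|\varphi_n^0\| \to 0.$$
The sequence $\varphi_n^0 = A_n\varphi_n^0$ converges together with $A_n\varphi_n^0$; we put
$\varphi_0 = \lim \varphi_n^0$. The vector $\varphi_0$, like the $\varphi_n^0$, has the norm 1 and
$$A\varphi_0 = \lim A\varphi_n^0 = \lim A_n\varphi_n^0 = \lim \varphi_n^0 = \varphi_0;$$
thus equation ($8_2$) does in fact possess a non-zero solution $\varphi_0$.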


Lemma 2. If for each $n$ equation ($9_2$) has some number $k$ of linearly
independent solutions $\varphi_{1n}^0, \varphi_{2n}^0, \dots, \varphi_{kn}^0$, then equation ($8_2$) also has $k$
linearly independent solutions $\varphi_1^0, \varphi_2^0, \dots, \varphi_k^0$.

Proof. We can take the solutions $\varphi_{1n}^0, \varphi_{2n}^0, \dots, \varphi_{kn}^0$ of ($9_2$) to be
orthogonal and normalised. We form the sequences
$$\varphi_{11}^0, \varphi_{12}^0, \dots, \varphi_{1n}^0, \dots,$$
$$\varphi_{21}^0, \varphi_{22}^0, \dots, \varphi_{2n}^0, \dots,$$
$$\dots\dots\dots$$
$$\varphi_{k1}^0, \varphi_{k2}^0, \dots, \varphi_{kn}^0, \dots$$
As was shown in Lemma 1, each of them contains a convergent
subsequence, and with the same convention as before we can assume
the sequences themselves to converge and their limits $\varphi_1^0, \varphi_2^0, \dots, \varphi_k^0$
to be non-zero solutions of ($8_2$). Moreover, since the functions
$\varphi_{1n}^0, \varphi_{2n}^0, \dots, \varphi_{kn}^0$ are orthogonal for each $n$, their limits $\varphi_1^0, \varphi_2^0, \dots, \varphi_k^0$
are also orthogonal and are therefore linearly independent as
required.

The totality of solutions of equation ($8_2$) constitutes a subspace,
which we denote by $\Phi_0$. By lemma 6, Section 3, the subspace $\Phi_0$
is finite-dimensional. We denote its dimension by $\nu$.

Lemma 3. If the completely continuous operator $A$ can be represented
as the limit of a sequence of degenerate operators $A_n$, it can also
be represented as the limit of a sequence of degenerate operators $\tilde{A}_n$
for each of which the space formed by its characteristic vectors coincides
with the subspace $\Phi_0$.

Proof. Let $\varphi_1^0, \dots, \varphi_\nu^0$ be an orthonormal system in the subspace
$\Phi_0$. We denote
$$h_i^n = \varphi_i^0 - A_n\varphi_i^0 \quad (i = 1, 2, \dots, \nu).$$
We define an operator $\tilde{A}_n$ by the formula
$$\tilde{A}_n\varphi = A_n\varphi + \sum_{i=1}^{\nu} h_i^n\,(\varphi, \varphi_i^0).$$
Evidently $\tilde{A}_n$ is degenerate together with the operator $A_n$. It is
clear that $\|\tilde{A}_n - A_n\| \to 0$, since $h_i^n \to 0$, and it follows that
$\|\tilde{A}_n - A\| \to 0$. We shall show that the vectors $\varphi_j^0$ are characteristic
vectors of the operator $\tilde{A}_n$; for
$$\tilde{A}_n\varphi_j^0 = A_n\varphi_j^0 + \varphi_j^0 - A_n\varphi_j^0 = \varphi_j^0,$$
as required.

Generally speaking, the operators $\tilde{A}_n$ can have more than $\nu$
linearly independent characteristic vectors, but this cannot be the
case for infinitely many of them. For if it were, then considering
only those values of $n$ for which the operator $\tilde{A}_n$ has at least
$\nu + 1$ linearly independent characteristic vectors and applying
lemma 2, we should obtain the contradiction that the operator $A$
has at least $\nu + 1$ linearly independent characteristic vectors.

Thus only finitely many of the operators $\tilde{A}_n$ can have more
than $\nu$ linearly independent characteristic vectors. Discarding
these and reindexing the sequence $\tilde{A}_n$ we get a sequence which
satisfies the conditions of lemma 3.
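
The correction used in the proof of lemma 3 is easy to try out in two dimensions. The sketch below is a finite-dimensional illustration only; the matrix `A`, its perturbations and the function names are assumptions chosen for the example. Here `A` has the characteristic vector $(3/5, 4/5)$ with characteristic value 1, the perturbed operators do not, and adding the term $h\,(\cdot\,, \varphi^0)$ with $h = \varphi^0 - A_n\varphi^0$ restores it exactly while the corrected operators still tend to `A`.

```python
from fractions import Fraction as F


def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def corrected_operator(A_n, basis):
    """Return A_n + sum_i h_i (., phi_i), where h_i = phi_i - A_n phi_i and
    `basis` is an orthonormal basis of the characteristic subspace of the limit A."""
    size = len(A_n)
    result = [row[:] for row in A_n]
    for phi in basis:
        h = [p - y for p, y in zip(phi, matvec(A_n, phi))]
        for r in range(size):
            for c in range(size):
                result[r][c] += h[r] * phi[c]
    return result


def perturbed(A, n):
    """Illustrative approximation A_n = A + E / n, with an arbitrary fixed E."""
    E = [[1, -1], [2, 0]]
    return [[A[r][c] + F(E[r][c], n) for c in range(2)] for r in range(2)]


def max_entry_gap(M, N):
    return max(abs(a - b) for rm, rn in zip(M, N) for a, b in zip(rm, rn))


# A phi0 = phi0, and phi0 has norm exactly 1 in rational arithmetic.
A = [[F(17, 25), F(6, 25)],
     [F(6, 25), F(41, 50)]]
phi0 = [F(3, 5), F(4, 5)]

gaps = {}
for n in (1, 10, 100):
    A_tilde = corrected_operator(perturbed(A, n), [phi0])
    assert matvec(A_tilde, phi0) == phi0      # phi0 is characteristic for every n
    gaps[n] = max_entry_gap(A_tilde, A)       # shrinks like 1/n
```

The dictionary `gaps` records how far the corrected operators are from `A`; in this example every entry differs by at most $3/n$.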


We can now prove that the solution subspace $\Psi_0$ of equation ($8_4$)
has the same dimensionality as the subspace $\Phi_0$.

We consider the sequence of degenerate operators $\tilde{A}_n$ defined
in lemma 3 and the sequence of adjoint operators $\tilde{A}_n^*$. Since the
Fredholm alternative holds for degenerate operators, each of the
operators $\tilde{A}_n^*$ has a space of characteristic vectors of dimension
exactly $\nu$. Since $\tilde{A}_n \to A$, we have also $\tilde{A}_n^* \to A^*$.

By lemma 2 there exists a system of $\nu$ orthonormal solutions of
the equation $A^*\psi_0 = \psi_0$. There cannot be more than $\nu$ such
solutions since, arguing in the reverse order from the operator $A^*$
to the operator $A$ with $\nu + 1$ such solutions, we should get $\nu + 1$
linearly independent solutions for the equation $A\varphi_0 = \varphi_0$, which
is excluded by hypothesis.

Thus the number of linearly independent solutions of the
equation $A^*\psi_0 = \psi_0$ is always identical with the corresponding
number for $A\varphi_0 = \varphi_0$.

We shall now discuss the question of the existence of solutions
to equation ($8_1$). Let us suppose that the vector $\varphi$ is a solution
of this equation and that $\psi_0$ is a solution of ($8_4$). Multiplying ($8_1$)
scalarly by $\psi_0$, we get
$$(\varphi, \psi_0) - (A\varphi, \psi_0) = (f, \psi_0).$$
But using (7) and ($8_4$), we have
$$(\varphi, \psi_0) - (A\varphi, \psi_0) = (\varphi, \psi_0) - (\varphi, A^*\psi_0) = (\varphi, \psi_0) - (\varphi, \psi_0) = 0,$$
so that for any solution $\psi_0$ of ($8_4$)
$$(f, \psi_0) = 0.$$
Thus equation ($8_1$) can only have a solution on condition that the
vector $f$ is orthogonal to all the solutions of ($8_4$). We shall show
that when this condition is satisfied a solution of ($8_1$) always
exists.

We consider a sequence of degenerate operators $A_n \to A$ with
the same subspace $\Phi_0$ of characteristic vectors as the operator $A$,
its dimension being $\nu$. The existence of such a sequence was
established in lemma 3.

Each of the operators $A_n^*$ possesses the same number $\nu$ of orthonormal
characteristic vectors $\psi_i^n$, since the alternative holds for
degenerate operators. By lemma 2, we can assume that as $n \to \infty$
these vectors tend respectively to the orthonormal characteristic
vectors $\psi_i^0$ $(i = 1, 2, \dots, \nu)$ of the operator $A^*$. We consider equation
($8_1$) with the right-hand side $f_n = f - \sum_{i=1}^{\nu} (f, \psi_i^n)\,\psi_i^n$. The vector $f_n$
is orthogonal to all the vectors $\psi_i^n$ ($f_n$ is the perpendicular dropped
from the end-point of the vector $f$ onto the subspace $L\{\psi_1^n, \psi_2^n, \dots, \psi_\nu^n\}$),
hence, applying the alternative for the degenerate operator $A_n$,
we establish the existence of a vector $\varphi_n$ satisfying the equation
$$\varphi_n - A_n\varphi_n = f_n.$$
As $n \to \infty$ the vectors $f_n$ converge to $f$, since $\psi_i^n \to \psi_i^0$ and $(f, \psi_i^0) = 0$
$(i = 1, 2, \dots, \nu)$. The vector $\varphi_n$ can be chosen for any $n$ so as to be
orthogonal to the subspace of characteristic vectors of the operator
$A_n$, i.e. to the subspace $\Phi_0$.

We shall show that the vectors $\varphi_n$ so obtained are bounded in
the norm. Let us suppose the contrary: let the norms of the $\varphi_n$
be unbounded; then discarding superfluous vectors, we can suppose
that $\|\varphi_n\| \to \infty$. Putting $\tilde{\varphi}_n = \dfrac{\varphi_n}{\|\varphi_n\|}$, we get a sequence of
normalised vectors which satisfy the equation
$$\tilde{\varphi}_n - A_n\tilde{\varphi}_n = \frac{f_n}{\|\varphi_n\|}. \tag{10}$$
As $n \to \infty$ the right-hand side of this equation tends to zero.
By the same considerations as above, the sequence of vectors
$A_n\tilde{\varphi}_n$ can be assumed convergent, and with it will converge the
sequence $\tilde{\varphi}_n$. Let $\tilde{\varphi}_0 = \lim \tilde{\varphi}_n$; since $\|\tilde{\varphi}_n\| = 1$, we have also
$\|\tilde{\varphi}_0\| = 1$. Passing to the limit in equation (10), we find that the
vector $\tilde{\varphi}_0$ satisfies the equation
$$\tilde{\varphi}_0 - A\tilde{\varphi}_0 = 0,$$
and it follows that $\tilde{\varphi}_0$ is contained in the subspace $\Phi_0$. At the
same time, since all the vectors $\tilde{\varphi}_n$ are orthogonal to $\Phi_0$, so also
is the limiting vector $\tilde{\varphi}_0$. The contradiction obtained shows that
the vectors $\varphi_n$ are in fact bounded in the norm.

Since the $\varphi_n$ are bounded, the sequence $A\varphi_n$ can, as before,
be supposed convergent; with it will converge the sequence
$A_n\varphi_n = (A_n - A)\varphi_n + A\varphi_n$, and also the sequence $\varphi_n = f_n + A_n\varphi_n$.
We denote the limit of the sequence $\varphi_n$ by $\varphi$; passing
to the limit in the equation
$$\varphi_n = f_n + (A_n - A)\varphi_n + A\varphi_n,$$
we get
$$\varphi = f + A\varphi,$$
i.e. the vector $\varphi$ is a solution of equation ($8_1$).

Thus if the right-hand side $f$ of equation ($8_1$) is orthogonal to every
solution of ($8_4$), then ($8_1$) is soluble. The solution $\varphi$ is then determined
up to an arbitrary solution $\varphi_0$ of the homogeneous equation ($8_2$)
and since the solution space of ($8_2$) is of finite dimension, it can be
chosen orthogonal to the whole of this space; such a choice then
determines it uniquely.

We have verified all the assertions of the alternative in case (b).

In case (a) equation ($8_2$) has no non-zero solutions and the
subspace $\Phi_0$ contains only the null vector. By what we have proved
the solution subspace $\Psi_0$ of equation ($8_4$) also contains only the
null vector in this case. As we have seen, equation ($8_1$) has a
solution for any $f$ orthogonal to the subspace $\Psi_0$; in the given
case this condition is satisfied by any vector $f$ and ($8_1$) therefore
has a solution for any $f$. This solution is unique; for the difference
of any two solutions of ($8_1$) is a solution of ($8_2$) and is therefore
zero by hypothesis.

The proof of the theorem is thus complete.

We observe that if both the kernel $K(x, s)$ and the free term
$f(x)$ are continuous, then in view of the concluding note to art. 3
of Section 4 the solution of the integral equation ($1_1$) is a continuous
function.

An important corollary of Fredholm’s theorem must be mentioned.

Fredholm’s alternative. Given the conditions of art. 1, one of
the two cases is possible: either the complete integral equation ($1_1$)
has a solution for any right-hand side $f(x) \in L_2(a, b)$, or the
homogeneous equation ($1_2$) has a non-zero solution.


Problems. 1. If $A$ is an operator in Hilbert space $H$, mapping $H$ on to a
finite-dimensional subspace $L$, $A^*$ also maps $H$ on to a finite-dimensional
subspace, of the same dimension as $L$.

Hint. The orthogonal complement of $L$ is mapped on to zero by $A^*$.
Hence $A^*H = A^*L$ and has a number of dimensions not exceeding that
of $L$. It follows from the symmetry of the construction that the number of
dimensions cannot be diminished.

2. Show that for any completely continuous operator $A$ in Hilbert space $H$
an orthogonal resolution $H = H_1 + H_2$ can be found such that $H_1$ is finite-
or countable-dimensional, $AH_1 \subset H_1$, $AH_2 = 0$.

Hint. Show that the closure of the subspace $AH$ cannot contain a
non-countable set of orthogonal vectors.

3. Show that every completely continuous operator in Hilbert space $H$
is a limit (in the norm) of degenerate operators.

Hint. By problem 2, we can assume that $H = l_2$. If $Ax = (\xi_1, \dots, \xi_n, \dots) \in l_2$,
put $A_n x = (\xi_1, \dots, \xi_n, 0, \dots)$. Use problem 3 of Section 1, art. 1.

4. Show that the adjoint to a completely continuous operator is also
completely continuous.

Hint. Use problems 1 and 3.

5. Show that Fredholm’s theorem still holds if the Fredholm integral
operator figuring in it is replaced by any completely continuous operator
in Hilbert space.

Hint. Use problems 1-4.


8. APPLICATIONS TO POTENTIAL THEORY


1. We shall assume a knowledge of the following results from
the theory of differential equations†:

(a) A function $u(x, y)$ which satisfies the equation
$$\Delta u \equiv \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{1}$$
on a region $G$ of the $xy$-plane is said to be harmonic. For example
the function
$$\log\frac{1}{\sqrt{(x - \xi)^2 + (y - \eta)^2}}, \tag{2}$$
which depends on the parameters $\xi$, $\eta$, is harmonic wherever the
expression under the square-root sign is non-zero (i.e. everywhere
except at the point $x = \xi$, $y = \eta$). Putting $(x, y) = P$, $(\xi, \eta) = Q$,
we shall denote the function (2) briefly by
$$\log\frac{1}{r(P, Q)}. \tag{3}$$
The partial derivatives of the function (3) are also harmonic
functions for $P \ne Q$.

† Cf., for example, I. G. Petrovsky, Lectures on Partial Differential
Equations, Interscience, 1955, Chapter 3.


(b) Let $C$ be some simple closed smooth contour dividing the
plane into two regions: an interior $G_i$ and an exterior $G_e$. We consider
a function $v$ which is continuous and differentiable in both
regions (with a possible discontinuity across the contour $C$),
denoting by $v_i$ its limiting value as it approaches the contour
from within and by $v_e$ its limiting value as it approaches from
without.

The normal derivatives $\partial v_i/\partial n$, $\partial v_e/\partial n$ have a similar sense
(the normal is taken to be positive inward).

If $v$ is a harmonic function, $v_i \equiv 0$ implies $v(P) \equiv 0$ in $G_i$ and
$\partial v_i/\partial n \equiv 0$ implies $v(P) = \text{const.}$ in $G_i$; if, in addition, $v$ is bounded
as $P \to \infty$, $v_e \equiv 0$ implies $v(P) \equiv 0$ in $G_e$ and $\partial v_e/\partial n \equiv 0$ implies
$v(P) = \text{const.}$ in $G_e$.

(c) The point $Q$ is always situated on the contour $C$ in what
follows. We write $l = l_Q$ for the coordinate of a point of the contour,
say the arc measured from a fixed initial point $Q_0$ to the
point $Q$, and $\omega$ for the angle between the ray $PQ$ and $PQ_0$. Let
a continuous function $\rho(l)$ be given. Then the function
$$v(P) = \int_C \rho(l)\,\frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)}\,dl$$
[the potential of a double layer of density $\rho(l)$] is harmonic in both
regions $G_i$ and $G_e$.

Since
$$\frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)} = -\frac{1}{r(P, Q)}\,\frac{\partial r(P, Q)}{\partial n_Q} = -\frac{1}{r(P, Q)}\cos(PQ, n) \tag{4'}$$
and at the same time
$$dl_Q = \frac{r(P, Q)\,d\omega}{\cos(PQ, n)},$$
we get a new expression for the potential as
$$v(P) = -\int_C \rho(l)\,d\omega. \tag{4}$$
Cc 


In particular, integral (4) is seen to exist at points $P$ of the contour
$C$ itself. If $\rho(l) \equiv 1$, the function $v(P)$ acquires a simple geometric
meaning; it gives the total increment of the angle described by
$PQ$ when $Q$ runs over the contour in the negative direction, so
that
$$\int_C \frac{\partial}{\partial n}\log\frac{1}{r(P, Q)}\,dl = -2\pi, \quad \text{if } P \in G_i, \tag{5}$$
$$\int_C \frac{\partial}{\partial n}\log\frac{1}{r(P, Q)}\,dl = -\pi, \quad \text{if } P \in C, \tag{6}$$
$$\int_C \frac{\partial}{\partial n}\log\frac{1}{r(P, Q)}\,dl = 0, \quad \text{if } P \in G_e. \tag{7}$$
Thus in this case the value of $v(P)$ in the region $G_i$ is $\pi$ less than
on the boundary and in the region $G_e$ it exceeds the boundary
value by $\pi$. In the general case with an arbitrary continuous
density $\rho(Q)$ we have
$$v_i(Q) = v(Q) - \pi\rho(Q), \tag{8}$$
$$v_e(Q) = v(Q) + \pi\rho(Q), \tag{9}$$
$$\frac{\partial v_i(Q)}{\partial n} = \frac{\partial v_e(Q)}{\partial n}. \tag{10}$$

The function
$$u(P) = \int_C \rho(l)\,\log\frac{1}{r(P, Q)}\,dl \tag{11}$$
[the potential of a simple layer of density $\rho(l)$] is also harmonic in
both the regions $G_i$, $G_e$ and we have
$$u_i(Q) = u_e(Q) = u(Q), \tag{12}$$
$$\frac{\partial u_i(P)}{\partial n} = \int_C \rho(l)\,\frac{\partial}{\partial n_P}\log\frac{1}{r(P, Q)}\,dl + \pi\rho(P) \quad (P \in C), \tag{13}$$
$$\frac{\partial u_e(P)}{\partial n} = \int_C \rho(l)\,\frac{\partial}{\partial n_P}\log\frac{1}{r(P, Q)}\,dl - \pi\rho(P) \quad (P \in C). \tag{14}$$
We note that the potentials $v(P)$ and $u(P)$, which possess continuous
($v(P)$) or piecewise-continuous ($u(P)$) normal derivatives
in the vicinity of the boundary, cannot have any well-behaved
(even, say, square-summable) derivatives in other directions.




2. The following problems arise:

(1) The first boundary problem (Dirichlet’s problem): to find
a function $v(P)$ which is harmonic on the region $G_i$ (the interior
problem) or on $G_e$ (the exterior problem) and reduces to a
prescribed function $f(Q)$ on the contour $C$.

(2) The second boundary problem (Neumann’s problem): to
find a function $u(P)$ which is harmonic on the region $G_i$ (the
interior problem) or on $G_e$ (the exterior problem) and the normal
derivative of which reduces to a prescribed function $g(Q)$ on the
contour $C$.

The last assertion of section (b) can be interpreted as a uniqueness
theorem for the solutions of these problems, viz. Dirichlet’s
interior problem can only have one solution, and Neumann’s interior
problem a single solution, discounting an added constant; Dirichlet’s
exterior problem can only have one solution in the class of bounded
functions, and Neumann’s exterior problem one solution in the same
class, discounting an added constant.

Fredholm gives a solution to these problems in the case of a
contour $C$ with continuous curvature.

3. The solution of Dirichlet’s interior problem is sought in the
form of a double layer potential
$$v(P) = \int_C \rho(l)\,\frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)}\,dl, \tag{15}$$
where the continuous function $\rho(l)$ is unknown. In virtue of
equation (8) and the conditions of the problem (we denote the
coordinate of a point $P$ on the contour $C$ by $x$)
$$v_i(P) = \int_C \rho(l)\,\frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)}\,dl - \pi\rho(x) = f(x),$$
so that the function $\rho(x)$ must be determined from Fredholm’s
integral equation of the second kind with kernel
$$K(x, l) = \frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)}.$$
This function is continuous for all $P$, $Q$ on the contour $C$; it
follows from (4') that as $P \to Q$ it has as its limit the curvature
of $C$ at the point $Q$, taken with the opposite sign.


In virtue of the Fredholm alternative established in Section 7, it
is sufficient to show that the corresponding homogeneous equation
$$\int_C \rho(l)\,K(x, l)\,dl - \pi\rho(x) = 0$$
has only a zero solution. Let us suppose the contrary; let $\rho_0(l)$ be
a non-zero solution of this equation. Then for the harmonic function
$$v_0(P) = \int_C K(x, l)\,\rho_0(l)\,dl = -\int_C \rho_0(l)\,d\omega$$
we get:
$$v_{i0}(P) = v_0(P) - \pi\rho_0(P) = 0,$$
and hence by note (b) $v_0(P) \equiv 0$ on the region $G_i$. But then
$\partial v_{i0}(P)/\partial n = 0$ as well, and by (10) $\partial v_{e0}(P)/\partial n = 0$. Since the
function $v_0(P)$ evidently tends to zero as $P$ tends to infinity, we
see by note (b) that $v_0(P) \equiv 0$ in $G_e$. Hence $v_{e0}(P) = 0$. We can
now infer from (8) and (9) that $\rho_0(P) \equiv 0$, as required.

Applying the results of Section 7, we get that the integral
equation (15) has a solution for any function $f(P)$. If $f(P)$ is continuous,
then in virtue of the continuity of the kernel $K(x, l)$ and the
concluding note of Section 4, art. 3, the solution $\rho(P)$ will also be
a continuous function. Formulae (8)-(10) will then hold for this
solution and the reduction of Dirichlet’s problem to the potential
(15) will be valid.

Thus Dirichlet’s interior problem has a solution for any continuous
boundary function $f(P)$.

The solution of Dirichlet’s exterior problem is sought in the
same form (15). Here we get the equation
$$v_e(P) = \int_C \rho(l)\,\frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)}\,dl + \pi\rho(P) = f(P). \tag{16}$$
But this time the corresponding homogeneous equation
$$\int_C \rho(l)\,K(x, l)\,dl + \pi\rho(x) = 0$$
already has a non-zero solution $\rho(x) \equiv 1$ (cf. equation (6)). This
solution is unique (up to scalar multiples). For reproducing the
arguments given above with the indexes $i$, $e$ interchanged, we
reach the conclusion $\partial v_i(P)/\partial n = 0$ and then by (b), $v(P) = \text{const.}$
on $G_i$. From (8) and (9) we deduce that $\rho(P) = \text{const.}$


It follows by the Fredholm alternative that equation (16) has
a solution, not for all $f$, but only for those that are orthogonal to
a certain fixed function $\rho_0(P)$, the solution of the adjoint equation,
which is unique up to scalar multiples.

But we can also make a solution feasible for any boundary
function $f$ if, instead of restricting ourselves to solutions determined
by (15), which evidently tend to zero at infinity, we also consider
the ones obtained from these by the addition of a constant. In fact
if $f$ is any function defined on the boundary, we can always find
a constant $c$ such that the difference $f - c$ is orthogonal to the
function $\rho_0$. Then by what we have proved there exists a solution
$v(P)$ of the exterior problem with boundary values $f - c$. On the
other hand $v_0(P) \equiv c$ is a solution of the exterior problem with
the constant boundary value $c$. It follows that $v(P) + v_0(P)$ is a
solution of the problem with boundary values $f$. Thus Dirichlet’s
exterior problem is soluble for any continuous boundary function $f$.

4. We look for a solution of Neumann’s interior problem in the
form of a simple layer potential
$$u(P) = \int_C \rho(l)\,\log\frac{1}{r(P, Q)}\,dl \tag{17}$$
with the function $\rho(l)$ unknown. From (13) and the conditions of
the problem
$$\frac{\partial u_i(P)}{\partial n} = \int_C \rho(l)\,\frac{\partial}{\partial n_P}\log\frac{1}{r(P, Q)}\,dl + \pi\rho(P) = g(P), \tag{18}$$
so that for the function $\rho(P)$ we again have Fredholm’s integral
equation, with the kernel
$$K_1(x, l) = \frac{\partial}{\partial n_P}\log\frac{1}{r(P, Q)},$$
the transpose of the kernel $K(x, l)$ which figures in Dirichlet’s
problem. By what we have proved, the homogeneous adjoint
equation
$$\int_C \rho(l)\,\frac{\partial}{\partial n_Q}\log\frac{1}{r(P, Q)}\,dl + \pi\rho(x) = 0$$
has only a constant solution. By the Fredholm alternative equation
(18) has a solution if and only if the function $g(x)$ is orthogonal
to 1, i.e.
$$\int_C g(l)\,dl = 0. \tag{19}$$
But we know that the equation
$$\int_C \frac{\partial u_i(l)}{\partial n}\,dl = 0$$
holds for any harmonic function $u(P)$ on the region $G_i$, whether
of the form (11) or not, provided that the function $\partial u_i/\partial n$ exists
and is continuous.

We see that the condition (19) is necessary and sufficient for the
solubility of Neumann’s interior problem.

Finally the solution of Neumann’s exterior problem, which we 
seek in the same form (17), leads to the integral equation 


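$$\frac{\partial u_e(P)}{\partial n} = \int_C \varrho(l)\, \frac{\partial}{\partial n_P} \log \frac{1}{r(P,Q)}\, dl - \pi \varrho(P) = g(P).$$

By what we have proved, the homogeneous adjoint equation

$$\int_C \varrho(l) K(x, l)\, dl - \pi \varrho(x) = 0$$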


has no non-zero solutions. Hence Neumann’s exterior problem has a 
solution for continuous boundary function g(P). 

But this solution, specified by potential (17), has a logarithmic 
growth at infinity in the general case and therefore falls outside 
the uniqueness class mentioned in Section (b). It can be shown 
that potential (17) is bounded at infinity (and moreover tends to 
zero) if and only if condition (19) is fulfilled. Thus (19) is the 
necessary and sufficient condition for solubility of the Neumann
exterior problem in the class of bounded functions. 


9. INTEGRAL EQUATIONS WITH COMPLEX PARAMETERS
1. Complex Hilbert Space 


It is often necessary in analysis to consider functions which 
assume complex values; it is natural to try to construct from 
them a space with scalar products. But axioms (a)-(d) of
Section 1, art. 1 cannot be simply carried over to the new scalar
product, for by axiom (d) the expression $(ix, ix)$ must be positive,
while by (a) and (c) it must be equal to $-(x, x)$, i.e. negative.




We reformulate axiom (a) in the complex space as follows: 
(a') $(y, x) = \overline{(x, y)}$, where the bar denotes complex conjugation.

We can then preserve the remaining axioms:

(b) $(x, y + z) = (x, y) + (x, z)$;

(c) $(\lambda x, y) = \lambda (x, y)$ for any complex $\lambda$;

(d) $(x, x) > 0$ for $x \neq 0$, and $(x, x) = 0$ for $x = 0$.

From axioms (a’) and (c) we get a new rule for carrying a com- 
plex multiplier in the second member of a scalar product through 
the product sign: 


$$(x, \lambda y) = \overline{(\lambda y, x)} = \overline{\lambda (y, x)} = \bar{\lambda}\, \overline{(y, x)} = \bar{\lambda} (x, y),$$


i.e. a complex multiplier in the second member of a scalar product
can be carried through the product sign under conjugation. 

An example of a complex Hilbert space is the space $L_2(a, b)$ of
complex functions $\varphi(x)$ with a square-summable modulus; the
scalar product is defined by the formula

$$(\varphi, \psi) = \int_a^b \varphi(x)\, \overline{\psi(x)}\, dx.$$

Another example is the space $l_2$ of complex sequences
$x = (\xi_1, \xi_2, \ldots)$ in which the squares of the moduli form a
convergent series:

$$\sum_{n=1}^{\infty} |\xi_n|^2 < \infty.$$

The scalar product of elements $x = (\xi_1, \xi_2, \ldots)$, $y = (\eta_1, \eta_2, \ldots)$
is given by the formula

$$(x, y) = \sum_{n=1}^{\infty} \xi_n \bar{\eta}_n.$$


All the basic results of this chapter carry over to the case of 
complex Hilbert spaces with more or less obvious modifications 
to formulations and results. Thus for the Cauchy-Bunyakovsky
inequality (Section 1, art. 3), we proceed as before from the
inequality

$$(\lambda x - y, \lambda x - y) \ge 0,$$

which holds for any complex $\lambda$. Expanding the left-hand side, we
get

$$\lambda \bar{\lambda} (x, x) - \lambda (x, y) - \bar{\lambda} (y, x) + (y, y) \ge 0.$$

We put $\lambda = t e^{-i \arg (x, y)}$ (t real), and then this inequality
takes the form

$$t^2 (x, x) - 2t\, |(x, y)| + (y, y) \ge 0,$$

so that, as before,

$$|(x, y)| \le |x|\, |y|.$$
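
The rules above can be checked numerically on finite complex sequences, regarded as elements of $l_2$ with only finitely many non-zero terms. The following is a minimal sketch; the helper names `inner` and `norm` and the sample vectors are illustrative choices, not taken from the text.

```python
def inner(x, y):
    """Scalar product (x, y) = sum of xi_n * conj(eta_n) for finite complex sequences."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    """Norm |x| = sqrt((x, x)); (x, x) is real and non-negative."""
    return abs(inner(x, x)) ** 0.5


# Illustrative vectors and multiplier (assumptions, chosen arbitrarily).
x = [1 + 2j, -1j, 0.5 + 0j]
y = [2 - 1j, 3 + 0j, 1 + 1j]
lam = 2 - 3j

# Axiom (a'): (y, x) is the complex conjugate of (x, y).
conj_symmetry_gap = abs(inner(y, x) - inner(x, y).conjugate())

# A multiplier in the second member is carried out under conjugation.
second_slot_gap = abs(inner(x, [lam * b for b in y]) - lam.conjugate() * inner(x, y))

# Cauchy-Bunyakovsky inequality |(x, y)| <= |x| |y|.
cb_holds = abs(inner(x, y)) <= norm(x) * norm(y)
```

Both gaps vanish up to rounding, and the inequality holds for any choice of the two vectors.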


If we have a real Hilbert space H, we can always construct a "complex
extension" $\tilde{H}$ of it from the formal sums $x + iy$, where
$x \in H$, $y \in H$. In the space $\tilde{H}$ linear operations of addition and
complex multiplication are introduced in the natural way; we
also define a scalar product by means of the formula

$$(x_1 + i y_1, x_2 + i y_2) = [(x_1, x_2) + (y_1, y_2)] + i[(y_1, x_2) - (x_1, y_2)].$$

It is easily verified that it satisfies conditions (a'), (b), (c), (d). In
particular,

$$(x + iy, x + iy) = (x, x) + (y, y).$$

The space $\tilde{H}$ contains the space H as a subspace (under real
multiplication only!) with the same scalar product.

The complex spaces we have introduced, $L_2(a, b)$ and $l_2$, are
evidently the complex extensions of the real spaces $L_2(a, b)$, $l_2$
considered earlier.

Every complete orthogonal system $e_1, \ldots, e_n, \ldots$ in H will also
be a complete orthogonal system in $\tilde{H}$; if $(x + iy, e_n) = 0$ for all n,
then for all n

$$(x, e_n) = 0, \quad (y, e_n) = 0,$$

and therefore $x = 0$, $y = 0$, $x + iy = 0$.

Of course there exist new orthonormal systems. Thus in the space
$L_2(-\pi, \pi)$ the system of functions $\frac{1}{\sqrt{2\pi}} e^{inx}$ $(n = 0, \pm 1, \pm 2, \ldots)$
is an example of a complete orthonormal system. Its completeness
follows from the fact that each function of the complete system 1,
cos x, sin x, ... is a linear combination of the functions $e^{inx}$.

In complex Hilbert space the development of a vector f in an
orthonormal system $e_1, e_2, \ldots, e_n, \ldots$ has the form

$$f = \sum_{n=1}^{\infty} c_n e_n,$$

where $c_n = (f, e_n) = \overline{(e_n, f)}$. In particular the Fourier coefficients
in the space $L_2(-\pi, \pi)$ for the development

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}$$

are given by the formula

$$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-inx}\, dx.$$

The fundamental theorem of Section 2, art. 4 carries over to
the complex case with the single modification that in Parseval's
equation (7) and Bessel's inequality (5) $|(f, e_n)|^2$ replaces $(f, e_n)^2$.

A linear operator A which has as its domain a complex space H
is said to be symmetric if, as before, it satisfies the equation

$$(A x, y) = (x, A y)$$

for any x, y in H. In general, linear operators in complex space
can have characteristic vectors with complex characteristic values.
But a symmetric operator A cannot have non-real characteristic values.
For let $A x = \lambda x$; then

$$(A x, x) = (\lambda x, x) = \lambda (x, x),$$

$$(x, A x) = (x, \lambda x) = \bar{\lambda} (x, x),$$

and it follows from the equation $(A x, x) = (x, A x)$ that $\lambda = \bar{\lambda}$ is
real.
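
A finite-dimensional illustration may help here. The sketch below, which assumes a 2 x 2 complex matrix as a stand-in for the operator A, computes the characteristic values from the characteristic polynomial; the function name `eigenvalues_2x2` and both sample matrices are illustrative assumptions.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


# Symmetric in the complex sense: m[j][k] equals the conjugate of m[k][j].
hermitian = [[2 + 0j, 1 - 3j],
             [1 + 3j, -1 + 0j]]

# Not symmetric: the characteristic values need not be real.
rotation = [[0j, -1 + 0j],
            [1 + 0j, 0j]]

herm_eigs = eigenvalues_2x2(hermitian)
rot_eigs = eigenvalues_2x2(rotation)
```

For the symmetric matrix both characteristic values are real (here 4 and -3), whereas the non-symmetric rotation matrix has the purely imaginary pair i and -i.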

It is easily verified that Fredholm's integral operator

$$A\varphi = \int_a^b K(x, s)\, \varphi(s)\, ds$$

will be a completely continuous symmetric operator on the complex
space $L_2(a, b)$ if the kernel K(x, s) satisfies the conditions

$$\int_a^b \int_a^b |K(x, s)|^2\, dx\, ds < \infty, \qquad K(x, s) = \overline{K(s, x)}.$$


Every linear operator A which has as its domain a real space H
can be extended to operate over the complex extension $\tilde{H}$ by means
of the formula

$$\tilde{A}(x + iy) = A x + i A y.$$

If in addition the operator A is symmetric on H, then the operator
$\tilde{A}$ will be symmetric on $\tilde{H}$:

$$\begin{aligned}
(\tilde{A}(x_1 + i y_1), x_2 + i y_2)
&= (A x_1, x_2) + i (A y_1, x_2) - i (A x_1, y_2) + (A y_1, y_2) \\
&= (x_1, A x_2) + i (y_1, A x_2) - i (x_1, A y_2) + (y_1, A y_2) \\
&= (x_1 + i y_1, \tilde{A}(x_2 + i y_2)).
\end{aligned}$$

The theorems on the existence of characteristic vectors (Sections 3
and 4) and the solution of integral equations (Section 6) carry
over to the complex case without change.
The Fredholm theorem (Section 8) remains valid as it stands
for an integral operator with a square-summable kernel on a complex
space $L_2(a, b)$:

$$A\varphi = \int_a^b K(x, s)\, \varphi(s)\, ds;$$

the adjoint operator $A^*$ is now defined by the formula

$$A^*\psi = \int_a^b \overline{K(s, x)}\, \psi(s)\, ds.$$


2. We can obtain some new information on the general properties
of operators if in place of the single equation

$$\varphi = A\varphi + f$$

we consider a family of equations with a complex parameter $\mu$:

$$\varphi = \mu A\varphi + f.$$

This family of equations can be written in the form

$$(E - \mu A)\varphi = f. \tag{1}$$

The operator A is supposed completely continuous as before, e.g.
the Fredholm operator, but not in general symmetric. Applying
the Fredholm alternative to equation (1), we get: for a fixed value
of $\mu$ either equation (1) has a unique solution for any f (the value
of $\mu$ is then said to be regular), or the homogeneous equation

$$(E - \mu A)\varphi = 0$$

has a non-zero solution $\varphi$ which will evidently be a characteristic
vector of the operator A with characteristic value $\lambda = 1/\mu$ (the
value of $\mu$ is then said to be singular). As $\mu$ varies both these
possibilities can be realised, but it turns out that the first is the
rule and the second the exception: more precisely, a completely
continuous operator A can possess only a finite number of distinct
characteristic values the moduli of which exceed a given positive
number. We have already established this for a completely continuous
symmetric operator in Section 3; we shall now show that
it is true for any completely continuous operator.

Let us suppose that the completely continuous operator A has
an infinite set of distinct characteristic values $\lambda_1, \lambda_2, \ldots$ the moduli of
which exceed the positive number $\delta$; let $g_1, g_2, \ldots$ be the corresponding
characteristic vectors. Orthonormalising the sequence
$g_1, g_2, \ldots$, we get a new sequence $e_1, e_2, \ldots, e_n, \ldots$ The vectors $e_n$
are in general no longer characteristic for the operator A; for if

$$e_n = a_{nn} g_n + a_{n,n-1} g_{n-1} + \ldots + a_{n1} g_1, \tag{2}$$

then

$$A e_n = a_{nn} \lambda_n g_n + a_{n,n-1} \lambda_{n-1} g_{n-1} + \ldots + a_{n1} \lambda_1 g_1. \tag{3}$$

The equation

$$A e_n = \lambda_n e_n + \sum_{k=1}^{n-1} a_{nk} (\lambda_k - \lambda_n) g_k,$$

which derives from (2) and (3), determines a resolution of the vector
$A e_n$ into a component lying in the subspace $(g_1, \ldots, g_{n-1})$ and
a component orthogonal to this subspace. The length of the latter
component is equal to $|\lambda_n|$. Thus the distance of the vector $A e_n$
from any vector in the subspace $(g_1, \ldots, g_{n-1})$, in particular from
the vector $A e_m$ for $m < n$, is not less than $|\lambda_n| > \delta$. But then it is
impossible to extract a fundamental sequence from the sequence
$A e_1, A e_2, \ldots$, which contradicts the complete continuity of the
operator A.
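
The key step of this argument, that the component of $A e_n$ orthogonal to $(g_1, \ldots, g_{n-1})$ has length $|\lambda_n|$, can be observed in a small finite-dimensional sketch. The 3 x 3 matrix below, its characteristic vectors and all function names are illustrative assumptions; the matrix is deliberately non-symmetric.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


def gram_schmidt(vectors):
    """Orthonormalise linearly independent real vectors in the given order."""
    basis = []
    for g in vectors:
        w = list(g)
        for e in basis:
            c = dot(g, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = dot(w, w) ** 0.5
        basis.append([wi / length for wi in w])
    return basis


# Non-symmetric matrix with characteristic values 2, 3, 4 (assumed example).
A = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [0.0, 0.0, 4.0]]
lams = [2.0, 3.0, 4.0]
gs = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 2.0, 2.0]]  # A g_k = lam_k g_k

es = gram_schmidt(gs)


def orthogonal_part_length(n):
    """Length of the part of A e_n orthogonal to span(g_1..g_{n-1}); n is 0-based."""
    w = matvec(A, es[n])
    for e in es[:n]:
        c = dot(w, e)
        w = [wi - c * ei for wi, ei in zip(w, e)]
    return dot(w, w) ** 0.5


lengths = [orthogonal_part_length(n) for n in range(3)]
```

The computed lengths agree with the moduli of the characteristic values, so the vectors $A e_n$ stay at mutual distance at least $\delta$, as the proof requires.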

Thus the characteristic values $\lambda_n$ of any completely continuous
operator form at most a countable sequence which converges to
zero. Hence the singular values $\mu = 1/\lambda_n$, for which the equation
$(E - \mu A)\varphi = 0$ has a non-zero solution, form at most a countable
sequence which diverges to $\infty$.

Let us consider a regular value $\mu$ for which the equation
$(E - \mu A)\varphi = f$ has a unique solution $\varphi$ for any $f \in H$. We denote
the solution $\varphi$, as a function of f, by $R_\mu f$. The operator $R_\mu$ is
evidently linear. We shall show that it is bounded. Let us suppose
the contrary: for some bounded sequence $f_n$ the vectors $\varphi_n = R_\mu f_n$
become unbounded in the norm. We put $\varphi_n / |\varphi_n| = e_n$, $f_n / |\varphi_n| = g_n$:
the vectors $e_n$ have the norm 1, the vectors $g_n$ tend to zero. The
sequence $A e_n$ contains a convergent subsequence; by renumbering
we can assume that the sequence $A e_n$ itself converges. But then
the sequence $e_n = g_n + \mu A e_n$ converges to some vector e, $|e| = 1$.
Since $g_n \to 0$, we have $e = \mu A e$ and $\mu$ is a singular value, in
contradiction to the hypothesis. Thus the operator $R_\mu$ is bounded.

Since the relations $(E - \mu A)\varphi = f$ and $\varphi = R_\mu f$ are equivalent,
the operator $R_\mu$ is the inverse of $E - \mu A$. It is said to be
the resolvent operator or the resolvent of the operator A.

3. The question arises as to how an explicit expression for the 
resolvent can be constructed. 

We apply the fixed point principle (Chapter II, Section 5) to
the operator $B\varphi = \mu A\varphi + f$. We recall that in order for an
operator B which has as its domain a complete space to have a
fixed point it is sufficient that it should be compressive, i.e. that
for some constant $\theta < 1$ it should satisfy the inequality

$$|B\varphi - B\psi| \le \theta\, |\varphi - \psi|. \tag{1}$$

In the given instance

$$|B\varphi - B\psi| = |\mu A\varphi - \mu A\psi| \le |\mu|\, \|A\|\, |\varphi - \psi|$$

and the sufficiency condition reduces to $|\mu| < 1/\|A\|$. Equation (1)
of art. 2 is then soluble and for sufficiently small $\mu$ the solution is
unique.

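As we showed in the same section, Chapter II, Section 5, the
actual solution is the limit of the sequence $\varphi_0, B\varphi_0, B^2\varphi_0, \ldots, B^n\varphi_0,
\ldots$, where the initial vector $\varphi_0$ is arbitrary. Putting $\varphi_0 = 0$, we get:

$$\begin{aligned}
B\varphi_0 &= f, \qquad B^2\varphi_0 = \mu A f + f, \\
B^3\varphi_0 &= \mu A(\mu A f + f) + f = \mu^2 A^2 f + \mu A f + f, \quad \ldots \\
B^n\varphi_0 &= f + \mu A f + \mu^2 A^2 f + \ldots + \mu^{n-1} A^{n-1} f.
\end{aligned}$$

The convergence of this process is equivalent to the convergence
of the series

$$f + \mu A f + \mu^2 A^2 f + \ldots \tag{2}$$

Thus for $|\mu| < 1/\|A\|$ equation (1) of art. 2 has a unique solution
in the form of the series (2).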
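
The successive approximations can be followed on a finite-dimensional model in which A is a matrix of small norm. This is only a sketch under that assumption; the matrix, the right-hand side and the name `successive_approximations` are illustrative.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def successive_approximations(A, f, mu, steps):
    """Iterate phi <- mu*A*phi + f starting from phi = 0; return the last iterate."""
    phi = [0.0] * len(f)
    for _ in range(steps):
        a_phi = matvec(A, phi)
        phi = [mu * ap + fi for ap, fi in zip(a_phi, f)]
    return phi


# Assumed example with |mu| * ||A|| < 1, so the series (2) converges.
A = [[0.2, 0.1],
     [0.0, 0.3]]
f = [1.0, 2.0]
mu = 1.0

phi = successive_approximations(A, f, mu, steps=200)

# Residual of the equation (E - mu*A) phi = f.
a_phi = matvec(A, phi)
residual = max(abs(p - mu * ap - fi) for p, ap, fi in zip(phi, a_phi, f))
```

After n steps the iterate is exactly the partial sum $f + \mu A f + \ldots + \mu^{n-1} A^{n-1} f$, so the residual decreases geometrically with ratio at most $|\mu|\, \|A\|$.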




It follows that the operator $R_\mu$ can be expressed for $|\mu| < 1/\|A\|$
in the form

$$R_\mu = E + \mu A + \mu^2 A^2 + \ldots \tag{3}$$

Note. The solution could be obtained by means of the formal
expansion of the expression $(E - \mu A)^{-1}$ in a power series in $\mu$:

$$(E - \mu A)^{-1} = E + \mu A + \mu^2 A^2 + \ldots \tag{4}$$

The series on the right-hand side of (4) converges for all $\mu$ with
$|\mu| < 1/\|A\|$, since $\|\mu^n A^n\| \le |\mu|^n \|A\|^n$. (Left or right)
multiplication by $E - \mu A$ gives E, so that the operator represented
by it really is the inverse of $E - \mu A$.
If A is Fredholm's integral operator

$$A\varphi = \int_a^b K(x, s)\, \varphi(s)\, ds, \qquad \int_a^b \int_a^b |K(x, s)|^2\, dx\, ds = K^2 < \infty,$$

then for $|\mu| < 1/K$, as we shall now show, the operator $R_\mu$ is of
the form $E + \Gamma_\mu$, where $\Gamma_\mu$ is a Fredholm integral operator the
kernel of which depends on the parameter $\mu$.

For the proof we reason as follows. Let there be given two integral
operators:

$$A\varphi = \int_a^b K(x, s)\, \varphi(s)\, ds, \qquad \int_a^b \int_a^b |K(x, s)|^2\, dx\, ds = K^2,$$

$$B\varphi = \int_a^b L(x, s)\, \varphi(s)\, ds, \qquad \int_a^b \int_a^b |L(x, s)|^2\, dx\, ds = L^2.$$

We shall construct the operator AB. We have:

$$AB\varphi = \int_a^b K(x, s) \left\{ \int_a^b L(s, t)\, \varphi(t)\, dt \right\} ds
= \int_a^b \left\{ \int_a^b K(x, s) L(s, t)\, ds \right\} \varphi(t)\, dt.$$

Reversing the order of integration is permissible in virtue of
Fubini's theorem, applied to the summable function of s and t

$$K(x, s)\, L(s, t)\, \varphi(t),$$

which is the product of two square-summable functions L(s, t)
and $K(x, s)\, \varphi(t)$. We further denote

$$M(x, t) = \int_a^b K(x, s) L(s, t)\, ds.$$


By the Cauchy-Bunyakovsky inequality

$$|M(x, t)|^2 \le \int_a^b |K(x, s)|^2\, ds \int_a^b |L(s, t)|^2\, ds,$$

so that M(x, t) is square-summable and

$$M^2 = \int_a^b \int_a^b |M(x, t)|^2\, dx\, dt
\le \int_a^b \int_a^b |K(x, s)|^2\, ds\, dx \int_a^b \int_a^b |L(s, t)|^2\, ds\, dt = K^2 L^2.$$

Thus the operator AB is a Fredholm integral operator the kernel
M(x, t) of which satisfies the inequality

$$M \le K L.$$
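
A discrete analogue of this inequality replaces the kernels by matrices, the integrals by sums, and the constants K, L, M by the root of the sum of squared moduli of the entries. The sketch below makes that assumption explicit; the two sample matrices are arbitrary.

```python
def square_sum_norm(m):
    """Square root of the sum of squared moduli of all entries (discrete analogue of K)."""
    return sum(abs(v) ** 2 for row in m for v in row) ** 0.5


def matmul(a, b):
    """Matrix product, the discrete analogue of M(x, t) = integral of K(x, s) L(s, t) ds."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


# Assumed sample "kernels".
K = [[1.0, -2.0], [0.5, 3.0]]
L = [[2.0, 1.0], [-1.0, 4.0]]
M = matmul(K, L)

inequality_holds = square_sum_norm(M) <= square_sum_norm(K) * square_sum_norm(L)
```

The same Cauchy-Bunyakovsky argument, applied entry by entry, proves the discrete inequality for any pair of matrices.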


It follows that each of the operators $A^2, A^3, \ldots, A^n, \ldots$ is an integral
operator together with A, and the kernel $K_n(x, s)$ of the operator
$A^n$ will satisfy the inequality

$$K_n^2 = \int_a^b \int_a^b |K_n(x, s)|^2\, dx\, ds \le K^{2n},$$

where $K^2 = \int_a^b \int_a^b |K(x, s)|^2\, dx\, ds$.

On the space $L_2(G)$, $G = \{a \le x \le b,\ a \le s \le b\}$, the series

$$\mu K(x, s) + \mu^2 K_2(x, s) + \ldots + \mu^n K_n(x, s) + \ldots \tag{5}$$

is majorized in the norm for $|\mu| < 1/K$ by the convergent series

$$|\mu| K + |\mu|^2 K^2 + \ldots + |\mu|^n K^n + \ldots$$

and therefore converges in the mean to some function $\Gamma(x, s, \mu)$
which is square-summable for each $\mu$, $|\mu| < 1/K$. But since for


integral operators we have

$$\|A\|^2 \le \int_a^b \int_a^b |K(x, s)|^2\, dx\, ds = K^2,$$

the convergence in the mean of the series (5) implies the convergence
of the series of operators $\mu A + \mu^2 A^2 + \ldots$; with the addition
of the unit operator the sum of this series is the resolvent $R_\mu$.
Thus for $|\mu| < 1/K$ the operator $R_\mu$ is the sum of the unit operator
and an integral operator with the kernel $\Gamma(x, s, \mu)$, as required.

The condition $|\mu| < 1/K$ is evidently not necessary for the
solubility of equation (1). If the series (5) converges, it always
determines a solution, but it can converge over a wider range of
values of $\mu$. For example, if some iterate of the kernel K(x, s)
vanishes, i.e.

$$K_m(x, s) \equiv 0$$

for some m, then (5) converges for all $\mu$. An example of such an
operator is an operator with the kernel $K(x, s) = p(x)\, q(s)$, where
the functions p(x), q(x) are orthogonal. For then we need look no
further than the second iterated kernel

$$K_2(x, s) = \int_a^b p(x)\, q(t)\, p(t)\, q(s)\, dt = 0.$$
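
The same phenomenon appears for matrices: if $A = p\, q^T$ with $q \cdot p = 0$, then $A^2 = 0$ and the series for the resolvent terminates, $R_\mu = E + \mu A$, for every $\mu$. The following sketch uses exact rational arithmetic; the vectors p, q and the helper names are illustrative assumptions.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Assumed example: p and q are orthogonal, 1*1 + 2*1 + (-1)*3 = 0.
p = [1, 2, -1]
q = [1, 1, 3]
n = len(p)

A = [[Fraction(p[i] * q[j]) for j in range(n)] for i in range(n)]  # "kernel" p(x) q(s)
E = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # unit operator

A_squared = matmul(A, A)  # discrete analogue of the second iterated kernel


def check_resolvent(mu):
    """Return (E - mu*A)(E + mu*A), which equals E whenever A^2 = 0."""
    left = [[E[i][j] - mu * A[i][j] for j in range(n)] for i in range(n)]
    right = [[E[i][j] + mu * A[i][j] for j in range(n)] for i in range(n)]
    return matmul(left, right)
```

Calling `check_resolvent` with any rational `mu`, however large, returns the unit matrix, so every value of $\mu$ is regular in this model.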


More generally the series (5) will converge for all y if the iterated 
kernels satisfy an inequality of the form 


Ch 
[Kn (a, s)| =r (6) 

In all these cases the resolvent $R_\mu$ exists for all values of $\mu$ and
all values of $\mu$ are regular. By way of example we consider the
Volterra operator

$$A\varphi(x) = \int_a^x K(x, s)\, \varphi(s)\, ds \qquad (a \le x \le b)$$

with a bounded kernel K(x, s), $|K(x, s)| \le M$. The Volterra
operator can be regarded as a particular case of Fredholm's integral
operator, when the kernel K(x, s) vanishes on the triangle $s > x$;
we therefore have an integral between the limits a, x instead of
a, b for s. We have:


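$$|K(x, s)| \le M,$$

$$|K_2(x, s)| = \left| \int_a^b K(x, t) K(t, s)\, dt \right|
= \left| \int_s^x K(x, t) K(t, s)\, dt \right|
\begin{cases} \le M^2 (x - s) & (x \ge s), \\ = 0 & (x \le s), \end{cases}$$

$$|K_3(x, s)| = \left| \int_a^b K_2(x, t) K(t, s)\, dt \right|
= \left| \int_s^x K_2(x, t) K(t, s)\, dt \right|
\begin{cases} \le \dfrac{M^3}{2} (x - s)^2 & (x \ge s), \\ = 0 & (x \le s), \end{cases}$$

and so on, so that for any n

$$|K_n(x, s)| \begin{cases} \le \dfrac{M^n}{(n-1)!} (x - s)^{n-1} & (x \ge s), \\ = 0 & (x \le s). \end{cases}$$

Hence

$$\int_a^b \int_a^b |K_n(x, s)|^2\, dx\, ds \le \left[ \frac{M^n (b - a)^n}{(n-1)!} \right]^2$$

and inequality (6) is satisfied.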
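
For the simplest Volterra kernel the successive approximations can be computed exactly. The sketch below assumes $K(x, s) \equiv 1$, $a = 0$ and $f \equiv 1$, so that the equation reads $\varphi(x) = \mu \int_0^x \varphi(s)\, ds + 1$ with solution $e^{\mu x}$; polynomials are stored as coefficient lists, and all names are illustrative.

```python
from fractions import Fraction
from math import factorial


def integrate_from_zero(coeffs):
    """Coefficients of the integral from 0 to x of the polynomial sum(c_k x^k)."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]


def add(p, q):
    """Sum of two polynomials given by coefficient lists of possibly different length."""
    size = max(len(p), len(q))
    p = p + [Fraction(0)] * (size - len(p))
    q = q + [Fraction(0)] * (size - len(q))
    return [a + b for a, b in zip(p, q)]


def volterra_iterates(mu, f, steps):
    """Iterate phi <- mu * (integral_0^x phi) + f, starting from phi = 0."""
    phi = []  # the zero polynomial
    for _ in range(steps):
        phi = add([mu * c for c in integrate_from_zero(phi)], f)
    return phi


mu = Fraction(5)  # deliberately large: no smallness condition on mu is needed
phi = volterra_iterates(mu, [Fraction(1)], steps=6)
expected = [mu ** k / factorial(k) for k in range(6)]
```

The n-th iterate is the Taylor polynomial of $e^{\mu x}$ of degree n - 1; the factorials in its coefficients are the same factorials that appear in the bound for $K_n$ and make the series converge for every $\mu$.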
In the general case, the series which determines the resolvent,

$$R_\mu = E + \mu A + \mu^2 A^2 + \ldots,$$

has a finite circle of convergence, outside which it does not allow
the resolvent to be directly calculated. We give (without derivation)
Fredholm's formulae, which give an expression for the
resolvent for any non-singular value $\mu$ in the case of a bounded
continuous kernel K(x, s)†. We fix values $x_1, \ldots, x_n$ and $s_1, \ldots, s_n$


† See e.g. I. I. Privalov, Integral equations, Gost (1937), Chapter II.
T. Carleman (1921) showed that these formulae remain in force for a
square-summable kernel. A simple proof (with a generalisation to an infinite
interval) may be found in S. G. Mikhlin, Dokl. Akad. Nauk SSSR, 92, No. 9,
387-90 (1944).






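and introduce the notation

$$K\begin{pmatrix} x_1 & \ldots & x_n \\ s_1 & \ldots & s_n \end{pmatrix}
= \begin{vmatrix} K(x_1, s_1) & \ldots & K(x_1, s_n) \\ \vdots & & \vdots \\ K(x_n, s_1) & \ldots & K(x_n, s_n) \end{vmatrix}.$$

We define two functions $D(\mu)$, $D(x, s, \mu)$ by the formulae

$$D(\mu) = 1 - \mu \int_a^b K\begin{pmatrix} \xi \\ \xi \end{pmatrix} d\xi
+ \frac{\mu^2}{2!} \int_a^b \int_a^b K\begin{pmatrix} \xi_1 & \xi_2 \\ \xi_1 & \xi_2 \end{pmatrix} d\xi_1\, d\xi_2 - \ldots
+ \frac{(-1)^n \mu^n}{n!} \int_a^b \ldots \int_a^b K\begin{pmatrix} \xi_1 & \ldots & \xi_n \\ \xi_1 & \ldots & \xi_n \end{pmatrix} d\xi_1 \ldots d\xi_n + \ldots,$$

$$D(x, s, \mu) = K\begin{pmatrix} x \\ s \end{pmatrix}
- \mu \int_a^b K\begin{pmatrix} x & \xi \\ s & \xi \end{pmatrix} d\xi
+ \frac{\mu^2}{2!} \int_a^b \int_a^b K\begin{pmatrix} x & \xi_1 & \xi_2 \\ s & \xi_1 & \xi_2 \end{pmatrix} d\xi_1\, d\xi_2 - \ldots
+ \frac{(-1)^n \mu^n}{n!} \int_a^b \ldots \int_a^b K\begin{pmatrix} x & \xi_1 & \ldots & \xi_n \\ s & \xi_1 & \ldots & \xi_n \end{pmatrix} d\xi_1 \ldots d\xi_n + \ldots$$

Both of these functions are entire analytic functions of $\mu$; moreover
the function $D(\mu)$ vanishes for precisely those $\mu$ that are
singular values of the operator A (when the resolvent $R_\mu$ does not
exist). For any regular value of $\mu$ the solution of the integral
equation

$$\varphi(x) - \mu \int_a^b K(x, s)\, \varphi(s)\, ds = f(x)$$

can be expressed in the form

$$\varphi(x) = f(x) + \mu \int_a^b \frac{D(x, s, \mu)}{D(\mu)}\, f(s)\, ds.$$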


Thus the resolvent $R_\mu$ of the operator A is, whenever it exists, the
sum of the unit operator and Fredholm's integral operator with
the ratio of the Fredholm functions $D(x, s, \mu)$, $D(\mu)$ as kernel. The
characteristic functions of A can also be expressed in terms of
$D(\mu)$ and $D(x, s, \mu)$; however these formulae are somewhat intractable
and we shall not give them here.


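Problems. 1. Show that the set of solutions of the equation

$$P(A)\varphi \equiv (A - \lambda_1 E) \ldots (A - \lambda_m E)\varphi = 0$$

($\lambda_1, \lambda_2, \ldots, \lambda_m$ distinct complex numbers) coincides with the set of linear
combinations of the characteristic vectors of the operator A which correspond
to the characteristic values $\lambda_1, \lambda_2, \ldots, \lambda_m$.
Hint. The development in partial fractions

$$\frac{1}{P(\lambda)} = \frac{a_1}{\lambda - \lambda_1} + \ldots + \frac{a_m}{\lambda - \lambda_m}$$

has an operator analogue

$$E = a_1 (A - \lambda_2 E) \ldots (A - \lambda_m E) + \ldots + a_m (A - \lambda_1 E) \ldots (A - \lambda_{m-1} E).$$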

2. If the operator $A^m$ has a characteristic value $\lambda$, then the operator A has
a characteristic value $\mu$ equal to one of the m-th roots of $\lambda$.

Hint. Use problem 1. 

3. If the operator A is symmetric and $A^m$ is completely continuous, then A
is completely continuous. 

Hint. The operator A has a complete orthogonal system of vectors with 
characteristic values which tend to zero. Use problem 5 of Section 3, art. 5. 

4. Show that an operator A defined on an orthonormal basis $e_1, e_2, \ldots,
e_n, \ldots$ by the formulae

$$A e_{2k-1} = e_{2k}, \quad A e_{2k} = 0 \quad (k = 1, 2, \ldots),$$

is not completely continuous, while $A^2$ is completely continuous.

In problems 5-8 the operator $A^m$ is assumed to be completely continuous
(A itself may not be).

5. Prove that the characteristic values of A form at most a countable 
sequence which converges to zero. 

Hint. Use problem 2. 

6. For sufficiently large prime p the characteristic vectors of $A^p$ corresponding
to the characteristic value 1 are precisely those of A corresponding
to the characteristic value 1.

Hint. Use problems 2 and 5. 

7. The dimension of the subspace of characteristic vectors of A corres- 
ponding to the characteristic value 1 is the same as that of the corresponding 
subspace for A*. 

Hint. Use problem 6 and the complete continuity of $A^p$ for $p \ge m$.

8. The equation $\varphi - A\varphi = f$ has a solution if and only if $(f, \psi) = 0$,
where $\psi$ is any characteristic vector of $A^*$ with the characteristic value 1.

Hint. Consider the equation

$$(E - A^p)\varphi = (E + A + \ldots + A^{p-1}) f$$

where p is such that none of the complex p-th roots of 1 is a characteristic value
of the operators A or $A^*$. Verify that it has a solution. If $\varphi$ is a solution,
then $(E - A)\varphi - f$ is mapped onto zero by the operator $E + A + \ldots + A^{p-1}$.
Applying the result of problem 1, obtain the relation $(E - A)\varphi - f = 0$
(S. L. Sobolev).

Note. Problems 7 and 8 show that the Fredholm alternative is valid for
operators A for which some power $A^m$ is completely continuous.

9. Let $\mu$ be a non-singular value of a completely continuous operator A
(i.e. $E - \mu A = B$ has an inverse) and let $Q = E - \alpha B^* B$, where $\alpha > 0$
can be taken arbitrarily small. Then the sequence $\varphi_{n+1} = Q\varphi_n + \alpha B^* f$
with arbitrary $\varphi_0$ converges to a vector $\varphi$ which satisfies the equation
$(E - \mu A)\varphi = f$.

Hint. The equation $(E - \mu A)\varphi = f$ is equivalent to $\varphi = Q\varphi + \alpha B^* f$.
Use the result of Section 3, art. 3, problem 5 (with $C = B^* B$) and the fixed
point method (I. P. Natanson, 1948).

Note. The result of this problem suggests an iterative procedure for solving
the equation $(E - \mu A)\varphi = f$ for any non-singular value of $\mu$.
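
The iterative procedure of problem 9 can be tried on a small real matrix, for which the adjoint $B^*$ is simply the transpose. In the sketch below the matrix A, the value $\mu = 1$ and the step $\alpha = 0.2$ are assumptions chosen so that the power series for the resolvent diverges ($|\mu|\, \|A\| > 1$) while $\mu$ is still non-singular.

```python
def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def iterate(A, f, mu, alpha, steps):
    """Run phi <- Q phi + alpha B* f with B = E - mu*A, Q = E - alpha B* B, phi_0 = 0."""
    n = len(f)
    B = [[(1.0 if i == j else 0.0) - mu * A[i][j] for j in range(n)] for i in range(n)]
    Bt = transpose(B)  # adjoint of a real matrix
    phi = [0.0] * n
    for _ in range(steps):
        # Q phi + alpha B* f  =  phi - alpha B* (B phi - f)
        defect = [r - fi for r, fi in zip(matvec(B, phi), f)]
        correction = matvec(Bt, defect)
        phi = [p - alpha * c for p, c in zip(phi, correction)]
    return phi


# Assumed example: characteristic values of A are 2 and 3, so mu = 1 is non-singular.
A = [[2.0, 1.0],
     [0.0, 3.0]]
f = [1.0, 2.0]
phi = iterate(A, f, mu=1.0, alpha=0.2, steps=500)
```

Here $B = E - A$ and the exact solution of $B\varphi = f$ is $\varphi = (0, -1)$; the iterates approach it although the series $f + \mu A f + \ldots$ diverges for this $\mu$.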

10. Show that if an equation of the first kind, $A\varphi = f$, has a solution, it is
the limit of a sequence 


Pn = Pn (H — pA A*) + ef, (1) 


where $0 < \mu < 2/\|A\|^2$ (B. M. Friedmann).

Hint. Substitute $\varphi_n = \varphi + u_n$ in (1) and obtain the formula
$u_n = (E - \mu A A^*) u_{n-1}$. If $e_n$ is an orthonormal system of characteristic
vectors of the operator $A A^*$ with characteristic values $\lambda_n$, then
$(u_n, e_i) = (1 - \mu\lambda_i)(u_{n-1}, e_i) = (1 - \mu\lambda_i)^n (u_0, e_i)$. Choose p so that
$|(u_0, e_p)|^2 + |(u_0, e_{p+1})|^2 + \ldots < \varepsilon$ and obtain the result that, for sufficiently large n,

$$|u_n|^2 = \sum_{i=1}^{\infty} |(u_n, e_i)|^2 \le \sum_{i=1}^{p-1} (1 - \mu\lambda_i)^{2n} |u_0|^2 + \sum_{i=p}^{\infty} |(u_0, e_i)|^2 < 2\varepsilon.$$


Concluding Remark 


The theory of integral equations with a variable upper limit was 
developed in 1887 by V. Volterra (Italian mathematician, 1860-1940). 
In the years 1900-1903 a series of fundamental papers by E. Fred- 
holm (Swedish mathematician, 1866-1927) appeared and in these 
he introduced the entire functions $D(\mu)$, $D(x, s, \mu)$ in terms of
which he expressed the solution of the general equation (‘‘Fred- 
holm’s equation’’) and the characteristic functions. In papers of 
the years 1904-1910 D. Hilbert (German mathematician, 1862- 
1943) first related geometrical concepts to integral equations, 
regarding the problem of characteristic functions as the problem 
of reducing a quadratic form of infinite rank to principal axes.
“Hilbert space” is one of the most important mathematical con- 
cepts of the twentieth century. Hilbert determined the canonical 




development of a bounded symmetric operator (the theorem 
proved in Section 3 is a special case relating to Hilbert space), 
which became the starting-point of the contemporary spectral 
theory of linear operators, of wide application in mathematics and 
physics. The class of completely continuous operators (in Banach 
space) was first distinguished by F. Riesz in 1919. For further 
applications of integral equations in mathematical physics, cf.: 
S. G. Mikhlin, Integral Equations, Pergamon, London, 1951; for
spectral theory and its applications cf.: N. I. Akhiezer and 
I. M. Glazman, Theory of Linear Operators in Hilbert Space, 
Ungar, New York, 1961, and M. A. Naimark, Differential Operators, 
State Tech. Pub. Dept., 1952. 


CHAPTER VI 


DIFFERENTIATION 
AND INTEGRATION 


It is well known that in the classical analysis of continuous functions
differentiation and integration are inverse operations. The
precise sense to be attached to this assertion is the following: 


A. Given a continuous function $\varphi(x)$, its indefinite integral

$$F(x) = \int_a^x \varphi(\xi)\, d\xi + C, \tag{1}$$

where C is an arbitrary constant, is a differentiable function and
for each x

$$F'(x) = \varphi(x). \tag{2}$$

B. If F(x) is a function which possesses a continuous derivative

$$\varphi(x) = F'(x),$$

then

$$\int_a^x \varphi(\xi)\, d\xi = F(x) + C, \tag{3}$$

where C is a certain constant (viz. $-F(a)$).

We now propose a much more general concept of the integral, 
which can be applied to a wide class of discontinuous functions. 
The continuity clause for $\varphi(x)$ will no longer hamper us in constructing
the indefinite integral (1).

The following problems arise: 


Problem I. Does equation (2) hold when $\varphi(x)$ in (1) is an arbitrary
summable function ? 

Since F(x) remains unaltered under an arbitrary modification
of the function $\varphi(x)$ on a set of measure zero, it is natural to require
in answering problem I that equation (2) hold not everywhere, but
only almost everywhere.


Problem II. Under what conditions does a given F(x) possess a
summable derivative $\varphi(x)$ (even if defined only almost everywhere)
and does formula (3) hold?

In this chapter we shall consider the answers to problems I and II
and also certain closely related topics, chief among which is the 
construction of the Stieltjes integral. 


1. DERIVATIVE OF A NON-DECREASING FUNCTION 


1. We first consider the problem: is the existence of a derivative 
an elementary property of a function f(x) or can it be deduced 
from more elementary properties? It is natural to regard e.g. 
continuity as a possible more elementary property. A function f(x),
having a derivative at a point $x_0$, is well known to be continuous
at $x_0$. Is the converse true, in other words, must a continuous
function f(x) have a derivative? Of course the answer is no;
very simple examples of the type $f(x) = |x|$ show that a function can be
continuous without having a derivative everywhere. It still seems
conceivable that points where there is no derivative may be exceptional
for a continuous function, and that any such function must
always have a substantial set of points at which it has a derivative.
This was the opinion of many mathematicians at the beginning
of the last century. In the end, the answer proved to be in the
negative: Weierstrass' famous example (of 1860) of a continuous
function† with no derivative at any point surprised the mathematical
world and put an end to the attempts to find the points
at which a continuous function must be differentiable. (Van der
Waerden's simpler example is given in one of the problems on this
article.) Thus the continuity of a function does not imply its
differentiability.

Let us try to approach the problem from another direction. A 
function f(x), having a derivative $f'(x_0) > 0$ at $x = x_0$, is not
decreasing in the neighbourhood of $x_0$, in the sense that we have
$f(x) > f(x_0)$ if $x > x_0$, and $f(x) < f(x_0)$ if $x < x_0$, for x sufficiently


† The Czech mathematician, B. Bolzano, constructed a similar example
in 1830. The manuscript was only discovered in 1920, and published in
1930, a hundred years after it was written.




close to $x_0$. Is it possible that the monotoneness of a function
implies the existence of a derivative? If we are speaking of individual
points, the answer is again no: for instance, $f(x) = |x| + 2x$
is increasing everywhere, but has no derivative at x = 0. However,
the points where a monotonic function has no derivative are in
fact the exception, not the rule. In fact, as shown by Lebesgue
(1902), a non-decreasing function can only lack a derivative on a
set of measure zero.


THEOREM 1 (H. Lebesgue). A non-decreasing function F(x) defined
on the closed interval $a \le x \le b$ has a finite derivative almost
everywhere on this interval.

We begin by giving the proof for the case when the function F(x) 
is continuous. 

Let G(x) be a function on the closed interval $a \le x \le b$. We
call a point $x \in [a, b]$ a point of ascent (Fig. 11) if there exists
a point $\xi$ to the right of x on [a, b] at which G takes a greater value
than at x:

$$G(x) < G(\xi) \quad (x < \xi). \tag{1}$$

We establish the following lemma:

Lemma (F. Riesz). The set of all points of ascent of a continuous
function G(x) is open on the closed interval [a, b] and at the end-points
of each component open interval $(a_k, b_k)$ of this set the inequality

$$G(a_k) \le G(b_k) \tag{2}$$

holds.


Fig. 11. Intervals $(a_1, b_1)$, $(a_2, b_2)$ of points of ascent.


Proof. Since the function G(x) is continuous, it is evident that
all points x sufficiently close to a point of ascent are also points
of ascent; thus the set Z of all points of ascent is open. Let $(a_k, b_k)$
be some component open interval of this set. We shall show that
for any $x \in (a_k, b_k)$ we have $G(x) \le G(b_k)$; then proceeding to the
limit as $x \to a_k$ we get the required inequality (2).

Let us suppose that $G(x) > G(b_k)$; we find the point $x_0$
furthest to the right in $(a_k, b_k)$ for which $G(x_0) = G(x)$. Then
since $G(x_0) = G(x) > G(b_k)$, it follows that $G(x_0) > G(\xi)$ everywhere
in the interval $(x_0, b_k)$. Further, the point $b_k$ does not belong
to the set of points of ascent and so everywhere to the right of $b_k$
we have $G(\xi) \le G(b_k)$. Consequently we get $G(x_0) > G(\xi)$ everywhere
to the right of $x_0$. But then $x_0$ cannot be a point of ascent,
in contradiction to the construction.
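
A discrete analogue of the lemma can be examined on a finite list of sampled values: an index i is treated as a "point of ascent" if some later value exceeds `values[i]`, and maximal runs of such indices play the part of the component intervals. This is only a sketch of the idea on sampled data, not the continuous statement; the function name and the sample list are illustrative.

```python
def ascent_runs(values):
    """Return maximal runs (first, last) of indices i with values[i] < max(values[i+1:])."""
    n = len(values)
    is_ascent = [False] * n
    best_right = float("-inf")
    # Scan from the right, keeping the largest value seen so far.
    for i in range(n - 1, -1, -1):
        is_ascent[i] = values[i] < best_right
        best_right = max(best_right, values[i])
    runs, start = [], None
    for i, flag in enumerate(is_ascent):
        if flag and start is None:
            start = i
        if not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    return runs  # the last index is never a point of ascent, so every run is closed


g = [3, 1, 2, 0, 1, 1.5, 0.5]
runs = ascent_runs(g)

# Analogue of inequality (2): every value inside a run is at most the value
# at the first index after the run (the right end-point b_k).
lemma_holds = all(g[i] <= g[last + 1]
                  for first, last in runs
                  for i in range(first, last + 1))
```

For the sample list the runs are the index ranges 1..1 and 3..4, and in each of them no value exceeds the value just to the right of the run.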

Note. We shall call a point with the property (1) more precisely
a point of ascent to the right. A point of ascent to the left can be
defined analogously as a point x to the left of which exists a point $\xi$
such that

$$\xi < x, \quad G(\xi) > G(x).$$

Just as above, it can be proved that the set of points of ascent to
the left is open and that at the end-points of its component open
intervals $(a_k, b_k)$ we have

$$G(a_k) \ge G(b_k).$$


Before proceeding to the proof of the Lebesgue theorem itself, we 
shall make a few further observations of a general nature. 
The derivative of a function F(x) is defined as the limit of the
ratio

$$\frac{F(\xi) - F(x)}{\xi - x} \tag{3}$$

as $\xi$ approaches x in accordance with any rule. Of course this limit
may not exist. But in every case the following quantities (for 
which we admit infinite values) always exist: 


$\Lambda_r$: the limit superior of the ratio (3) as $\xi$ approaches x from
the right (the upper right-derived number);

$\lambda_r$: the corresponding limit inferior of (3) (the lower right-derived
number);

$\Lambda_l$: the limit superior of (3) as $\xi$ approaches x from the left (the
upper left-derived number);

$\lambda_l$: the corresponding limit inferior of (3) (the lower left-derived
number).


Fig. 12


Figure 12 illustrates a case in which all four derived numbers are
finite and distinct at the given point x. It is left to the reader to
render graphically the case in which all four values are prescribed
at the given point with $-\infty \le \lambda_r \le \Lambda_r \le +\infty$, $-\infty \le \lambda_l \le \Lambda_l \le +\infty$.

If $\Lambda_r = \lambda_r \ne \pm\infty$, the function F(x) possesses a derivative to
the right at the given point; if $\Lambda_l = \lambda_l \ne \pm\infty$, it has a derivative
to the left. Finally if all four derived values are finite and equal,
it has a derivative at the point in question.

Bearing all this in mind, we set about proving Lebesgue’s 
theorem. 

We recall that the theorem stipulates a non-decreasing function:
it is always the case for $x < \xi$ that $F(x) \le F(\xi)$. Hence the ratio (3)
is non-negative, and together with it all four derived numbers
$\Lambda_r$, $\lambda_r$, $\Lambda_l$, $\lambda_l$ are non-negative at every point.

We shall show that almost everywhere

$$\Lambda_r < +\infty.$$

If at some point x we have $\Lambda_r = +\infty$, then for any C we can find
a point $\xi$ to the right of x at which

$$\frac{F(\xi) - F(x)}{\xi - x} > C,$$

or, what is the same thing,

$$F(x) - Cx < F(\xi) - C\xi.$$




Thus every point x at which $\Lambda_r = +\infty$ is a point of ascent to the
right for the function $G(x) = F(x) - Cx$. By Riesz' lemma the
set of all points of ascent to the right is open and at the end-points
of its component open intervals $(a_k, b_k)$

$$F(a_k) - C a_k \le F(b_k) - C b_k,$$

or, what is the same thing,

$$C (b_k - a_k) \le F(b_k) - F(a_k). \tag{4}$$

Summing the inequalities (4) over all component open intervals,
we get

$$C \sum_k (b_k - a_k) \le \sum_k \bigl( F(b_k) - F(a_k) \bigr) \le F(b) - F(a).$$

We see that the set Z of all points x at which $\Lambda_r = +\infty$ can
be covered by a system of open intervals of overall length

$$\sum_k (b_k - a_k) \le \frac{1}{C} \bigl[ F(b) - F(a) \bigr].$$

Since C can be chosen arbitrarily large, we conclude that the set Z
has measure zero, as asserted.
The next step in the proof is to verify that almost everywhere

$$\Lambda_r \le \lambda_l.$$

The set of points where $\Lambda_r > \lambda_l$ can be represented in the form of
a countable sum of sets $Z_{cC}$, determined by the inequalities

$$\lambda_l < c < C < \Lambda_r,$$

where c, C range over all possible pairs of rational constants (c < C).
It is therefore enough for us to prove that each of the sets $Z_{cC}$ has
measure zero.
Let x € Zc. Then, since 2, < c, there exists a point & lying to. 
the left of « at which 
F()— Fe) 
Eé—«2 


We observe that here € — x < 0; hence (5) is equivalent to the 
inequality 


(5) 


F(é) -—c& > F(x) — cx. 


Thus x is a point of ascent to the left for the function
$G(x) = F(x) - cx$. Applying Riesz' lemma (cf. the note on p. 285), we
get for the component open intervals of the set of all points of
ascent to the left inequalities of the form

$$F(a_k) - c a_k \ge F(b_k) - c b_k,$$

or, what is the same thing,

$$F(b_k) - F(a_k) \le c (b_k - a_k). \tag{6}$$


The point x in question lies in one of the given open intervals
$(a_k, b_k)$. Since at this point $\Lambda_r > C$, we can find a point $\xi > x$ in
$(a_k, b_k)$ at which

$$\frac{F(\xi) - F(x)}{\xi - x} > C. \tag{7}$$


We shall carry out the construction that follows within the interval
$(a_k, b_k)$. As above, inequality (7) shows that x is a point of ascent
to the right for the function F(x) - Cx. The set of all points of
ascent to the right for this function in the interval $(a_k, b_k)$ is open
and has a decomposition into the sum of component open intervals
$(a_{kj}, b_{kj})$ (j = 1, 2, ...), at the end-points of which we have

$$F(a_{kj}) - C a_{kj} \le F(b_{kj}) - C b_{kj}.$$

In other words

$$F(b_{kj}) - F(a_{kj}) \ge C (b_{kj} - a_{kj}).$$

Summing over the index j, we get

$$\sum_j (b_{kj} - a_{kj}) \le \frac{1}{C} \sum_j [F(b_{kj}) - F(a_{kj})] \le \frac{1}{C} [F(b_k) - F(a_k)].$$

Using inequality (6), we get

$$\sum_j (b_{kj} - a_{kj}) \le \frac{c}{C} (b_k - a_k).$$

Summing over k, we find

$$\sum_k \sum_j (b_{kj} - a_{kj}) \le \frac{c}{C} \sum_k (b_k - a_k). \tag{8}$$


The system of intervals $(a_{kj}, b_{kj})$, like the system $(a_k, b_k)$, covers
the set $Z_{cC}$ and we see that the first system covers it more
"economically" than the second.




This construction can be repeated for each point $x \in Z_{cC}$ in the
corresponding open interval $(a_{kj}, b_{kj})$. We obtain a new "third-order"
system $(a_{kjm}, b_{kjm})$ (m = 1, 2, ...) and a fourth-order
system $(a_{kjmn}, b_{kjmn})$ (m, n = 1, 2, ...) with

$$\sum_{m,n} (b_{kjmn} - a_{kjmn}) \le \frac{c}{C} \sum_m (b_{kjm} - a_{kjm}) \le \frac{c}{C} (b_{kj} - a_{kj}).$$

Summing over k, j and using (8), we get:

$$\sum_{k,j,m,n} (b_{kjmn} - a_{kjmn}) \le \left(\frac{c}{C}\right)^2 \sum_k (b_k - a_k).$$

Continuing this process, we can cover the set $Z_{cC}$ with still finer
systems of open intervals, the overall length of the 2p-th covering
not exceeding $(c/C)^p (b - a)$, and this quantity can be made
arbitrarily small by taking p sufficiently large. The set $Z_{cC}$ therefore
has measure zero, as asserted. Thus for any non-decreasing function
the inequalities $\Lambda_r < +\infty$, $\Lambda_r \le \lambda_l$ hold almost everywhere. Let us
replace the function F(x) by $F^*(x) = -F(-x)$; the function
$F^*(x)$ is also non-decreasing, and again the inequality $\Lambda_r^* \le \lambda_l^*$
holds almost everywhere. But it is easy to see that at corresponding
points $\Lambda_r^* = \Lambda_l$, $\lambda_l^* = \lambda_r$; hence the inequality $\Lambda_l \le \lambda_r$ holds almost
everywhere. Thus we get the chain of inequalities

$$0 \le \Lambda_r \le \lambda_l \le \Lambda_l \le \lambda_r \le \Lambda_r < +\infty,$$

which hold simultaneously on a set of full measure; we see that
on this set

$$0 \le \lambda_r = \Lambda_r = \lambda_l = \Lambda_l < +\infty,$$

i.e. the function F(x) has a finite derivative.

Lebesgue’s theorem is therefore proved for a continuous non- 
decreasing function. 

We now return to the case of discontinuous non-decreasing func- 
tions. 

We observe that an arbitrary non-decreasing function F(x) can
only have discontinuities of the first kind, so that F(x) has limiting
values on both right and left at every point:

$$F(x + 0) = \lim_{\xi \to x + 0} F(\xi), \qquad F(x - 0) = \lim_{\xi \to x - 0} F(\xi).$$




For a multiplicity of limiting values on either side would contradict
the monotonicity of the function. The interval (F(x - 0), F(x + 0))
is said to be the interval of discontinuity and its length
F(x + 0) - F(x - 0) the saltus of the function F(x) at the point x.
Since F(x) is non-decreasing, the intervals (F(x - 0), F(x + 0))
corresponding to distinct points of discontinuity are non-overlapping
(at most they can have an end-point in common); hence
the set of such intervals is at most countable. It follows that the
set of discontinuities of a non-decreasing function is at most countable.

To verify the existence of a derivative for a discontinuous
non-decreasing function we generalise Riesz' lemma accordingly. Let
G(x) be a function having at worst discontinuities of the first kind.
We shall call a point x a point of ascent to the right if there exists
to the right of x a point $\xi$ at which

$$\max \{G(x), G(x - 0), G(x + 0)\} < G(\xi).$$

Repeating the arguments adduced above in support of Riesz'
lemma we get that the set of all points of ascent to the right is
open and for each of its component open intervals $(a_k, b_k)$ that

$$G(a_k + 0) \le \max \{G(b_k), G(b_k - 0), G(b_k + 0)\}.$$

But this is already enough to carry through unaltered the proof
of the theorem itself.

Thus every non-decreasing function has a finite derivative almost 
everywhere. 


Problems. Prove the following propositions: 

1. If one of the right-derived numbers of a continuous function F(x) is
non-negative in the open interval (a, b) then $F(a) \le F(b)$.

Hint. Each point is a point of ascent to the right.

2. If one of the right-derived numbers is restricted to the range $[\alpha, \beta]$ in
the interval a < x < b, then for any $x_1$, $x_2$ in (a, b) we have

$$\alpha \le \frac{F(x_2) - F(x_1)}{x_2 - x_1} \le \beta.$$

Hint. Apply the result of problem 1 to the function $F(x) - \alpha x$.

3. If one of the derived numbers of the function F(x) is continuous at the
point $x_0$, then $F'(x_0)$ exists.

Hint. Use problem 2.
4. (Van der Waerden's example). Let

$$\varphi_0(x) = \begin{cases} x & \text{for } 0 \le x \le \tfrac12, \\ 1 - x & \text{for } \tfrac12 \le x \le 1, \end{cases}$$

and let us continue this function throughout the axis with period 1. Further,
let

$$\varphi_n(x) = \frac{1}{4^n} \varphi_0(4^n x).$$

The function $\varphi_n(x)$ has period $4^{-n}$ and a derivative (everywhere except at
angular points with abscissae $p/(2 \cdot 4^n)$) equal to +1 or -1. Finally, let

$$f(x) = \sum_{n=0}^{\infty} \varphi_n(x).$$

Show that f(x) is continuous but lacks a derivative at every point.

Hint. Fix m for a given $x_0$ and take the increment $\Delta x = \pm 1/4^m$. The
increments of all the $\varphi_n(x)$, as from the m-th, will now vanish. The function
$\varphi_{m-1}(x)$ has intervals without angular points of length $2/4^m$; the one that
contains $x_0$ will also contain one of the intervals $(x_0, x_0 + 1/4^m)$ or
$(x_0, x_0 - 1/4^m)$. But all the preceding functions $\varphi_k(x)$, k < m - 1, have no
angular points in this interval; their increments will be equal in modulus to the
increment of the argument. All in all, we have

$$\frac{\Delta f(x)}{\Delta x} = \sum_{k=0}^{m-1} \frac{\Delta \varphi_k(x)}{\Delta x} = \begin{cases} \text{an even number} & \text{if } m \text{ is even}, \\ \text{an odd number} & \text{if } m \text{ is odd}. \end{cases}$$

Thus $\Delta f(x)/\Delta x$ has no limit as $\Delta x \to 0$.


5. At a point x, $0 \le x < 1$, having the dyadic resolution $0.a_1 a_2 \dots a_n \dots$
(as usual, we exclude resolutions of period 1), we put $f(x) = 0.a_1 0 a_3 0 \dots$,
replacing the dyadic digits at even positions by zeros. Show that f(x) is
continuous from the right and nowhere has a right-hand derivative.

Hint. Consider the increment of the function on changing the n-th dyadic
digit in x from 0 to 1 or the group 01 to 10.
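
The parity argument in the hint to problem 4 can be checked with exact rational arithmetic. The sketch below is only an illustration: the function names, the choice $x_0 = 1/3$ and the number of extra terms are our own assumptions. It forms the difference quotient of a partial sum of the series for the increment $\pm 1/4^m$, with the sign chosen so that no angular point of $\varphi_{m-1}$ lies between $x_0$ and $x_0 + \Delta x$.

```python
from fractions import Fraction

def phi0(x):
    """Distance from x to the nearest integer: period 1, slopes +1 and -1."""
    frac = x - (x // 1)
    return min(frac, 1 - frac)

def partial_sum(x, terms):
    """Sum of phi_n(x) = phi0(4**n * x) / 4**n for n = 0, ..., terms - 1."""
    return sum(phi0(4**n * x) / 4**n for n in range(terms))

def difference_quotient(x0, m, extra_terms=5):
    """Quotient of the increment of a partial sum for dx = +-1/4**m.

    The sign is chosen so that x0 and x0 + dx lie in one linear piece of
    phi_{m-1}; terms with index >= m have a period dividing dx and cancel.
    """
    step = Fraction(1, 4**m)
    cell = 2 * step                     # length of a linear piece of phi_{m-1}
    left_corner = (x0 // cell) * cell   # nearest angular point on the left
    dx = step if x0 + step <= left_corner + cell else -step
    terms = m + extra_terms
    return (partial_sum(x0 + dx, terms) - partial_sum(x0, terms)) / dx

x0 = Fraction(1, 3)
quotients = [difference_quotient(x0, m) for m in range(1, 9)]
print(quotients)
```

Each quotient is an integer with the same parity as m, so the sequence of quotients cannot converge as m grows.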


2. To the usual rules for differentiating sums and products, we
add here a theorem on the term-by-term differentiation of a series
of monotone functions:

THEOREM 2 ("Fubini's little theorem"). A series of monotone
(non-decreasing) functions

$$\sum_{n=1}^{\infty} F_n(x) = F(x) \tag{1}$$

which converges everywhere admits term-by-term differentiation almost
everywhere:

$$\sum_{n=1}^{\infty} F_n'(x) = F'(x).$$

Proof. We can suppose without loss of generality that all the
functions $F_n(x)$ are non-negative and vanish at x = a: in the
contrary event we could replace $F_n(x)$ by $F_n(x) - F_n(a)$.




The sum of a series of non-decreasing functions is of course a
non-decreasing function. Let us consider the set E of full measure
on which all the $F_n'(x)$ and $F'(x)$ exist. For $x \in E$ and any $\xi$ we
have

$$\sum_{n=1}^{\infty} \frac{F_n(\xi) - F_n(x)}{\xi - x} = \frac{F(\xi) - F(x)}{\xi - x}.$$

Since the terms on the left are non-negative, we have for any N

$$\sum_{n=1}^{N} \frac{F_n(\xi) - F_n(x)}{\xi - x} \le \frac{F(\xi) - F(x)}{\xi - x}.$$

Proceeding to the limit as $\xi \to x$, we get

$$\sum_{n=1}^{N} F_n'(x) \le F'(x),$$

so that, letting N tend to $\infty$ and remembering that all the $F_n'(x)$
are non-negative, we find

$$\sum_{n=1}^{\infty} F_n'(x) \le F'(x). \tag{2}$$
We shall show that in fact the equality sign holds here for almost
all x. For a given k we find a partial sum $S_{n_k}(x)$ of the series (1) for
which

$$0 \le F(b) - S_{n_k}(b) < \frac{1}{2^k} \qquad (k = 1, 2, \dots).$$

It follows, since the difference $F(x) - S_{n_k}(x) = \sum_{j > n_k} F_j(x)$ is a
non-decreasing function, that for all x

$$0 \le F(x) - S_{n_k}(x) < \frac{1}{2^k},$$

and hence the series of non-decreasing functions

$$\sum_{k=1}^{\infty} [F(x) - S_{n_k}(x)]$$

converges (uniformly, what is more) on the whole closed interval
$a \le x \le b$. But then by what we have proved the series of
derivatives converges almost everywhere. The general term $F'(x) - S_{n_k}'(x)$
of this series tends almost everywhere to zero, which means
that $S_{n_k}'(x) \to F'(x)$ almost everywhere. But if we had the < sign
in inequality (2), no sequence of partial sums could have the
limit $F'(x)$. We must therefore have the equality sign for almost
all x, as asserted.

3. The decomposition of a non-decreasing function into a saltus
function and a continuous function. Let $A = \{x_1, x_2, \dots\}$ be an
arbitrary finite or countable subset of the closed interval [a, b] and
let $B = \{h_1, h_2, \dots\}$ be an equivalent set of positive numbers with
a finite sum $\sum_n h_n$. We set up a one-one correspondence between
A and B so that the point $x_n$ corresponds to the number $h_n$ bearing
the same index n, and define the function H(x) as the sum of all
the $h_n$ which correspond to points $x_n$ lying not to the right of x:

$$H(x) = \sum_{x_n \le x} h_n.$$

The function H(x) so constructed is said to be a saltus function.
Since it is non-decreasing, it has an at most countable set of
discontinuities. We shall show that it is continuous to the right, that
all its discontinuities lie at the points $x_1, x_2, \dots, x_n, \dots$ and that
the corresponding salti $H(x_n) - H(x_n - 0)$ are equal to just the
numbers $h_n$. For a given $x = x_0$, we have

$$H(x_0 + 0) = \lim_{x \to x_0 + 0} \sum_{x_n \le x} h_n = \sum_{x_n \le x_0} h_n = H(x_0),$$

$$H(x_0 - 0) = \lim_{x \to x_0 - 0} \sum_{x_n \le x} h_n = \sum_{x_n < x_0} h_n.$$

If $x_0$ does not coincide with one of the $x_n$, then $H(x_0 + 0) = H(x_0)
= H(x_0 - 0)$ and H(x) is continuous at $x_0$. If $x_0$ coincides with
one of the $x_n$, then the difference between $H(x_0 + 0) = H(x_0)$
and $H(x_0 - 0)$, i.e. the saltus at $x_n$, is equal to $h_n$. Our assertion is
therefore proved. By Fubini's little theorem the function H(x), as
the sum of a convergent series of non-decreasing functions

$$H_n(x) = \begin{cases} 0 & \text{for } x < x_n, \\ h_n & \text{for } x \ge x_n, \end{cases}$$

the derivatives of which vanish almost everywhere, also has a
derivative which vanishes almost everywhere.
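
The construction of a saltus function can be tried out on a finite example. The sketch below is illustrative only: the data $x_n = 1/n$, $h_n = 1/2^n$ (n = 1, ..., 20) and the helper name `make_saltus` are assumptions of ours, and a finite set of jump points stands in for the countable set A.

```python
from fractions import Fraction

def make_saltus(points, jumps):
    """Return H with H(x) = sum of jumps[n] over all points[n] <= x."""
    pairs = list(zip(points, jumps))

    def H(x):
        return sum((h for p, h in pairs if p <= x), Fraction(0))

    return H

# Hypothetical data: jump points x_n = 1/n with salti h_n = 1/2**n.
points = [Fraction(1, n) for n in range(1, 21)]
jumps = [Fraction(1, 2**n) for n in range(1, 21)]
H = make_saltus(points, jumps)

eps = Fraction(1, 10**6)   # smaller than every gap between neighbouring points
left_salti = [H(p) - H(p - eps) for p in points]   # H(x_n) - H(x_n - 0)
right_salti = [H(p + eps) - H(p) for p in points]  # H(x_n + 0) - H(x_n)
print(left_salti[:3], right_salti[:3])
```

The left-hand salti reproduce the numbers $h_n$ and the right-hand salti vanish, in agreement with the continuity of H(x) to the right.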




THEOREM. Any non-decreasing function F(x), continuous to the
right, can be represented as the sum of two non-decreasing functions:

$$F(x) = H(x) + G(x),$$

where H(x) is a saltus function and G(x) a continuous function.

Proof. Let $x_1, x_2, \dots$ be all the discontinuities of the function
F(x) and $h_1, h_2, \dots$ the corresponding salti. We construct a saltus
function by means of the formula

$$H(x) = \sum_{x_n \le x} h_n.$$

We shall show that the difference G(x) = F(x) - H(x) is continuous
and non-decreasing. For x' < x'' we have

$$G(x'') - G(x') = [F(x'') - F(x')] - [H(x'') - H(x')],$$

where the difference on the right is non-negative [it represents the
measure of the range of F(x) over the set of points of continuity of
the interval (x', x'')]. Further, at each point x there exist G(x + 0)
= G(x) and G(x - 0), and

$$G(x + 0) - G(x - 0) = [F(x + 0) - F(x - 0)] - [H(x + 0) - H(x - 0)];$$

in virtue of the properties of the function H(x) indicated above,
this difference vanishes for all x, so that the function G(x) is
continuous. The theorem is proved.

Note. The condition of continuity to the right, which figures in
all our constructions, can be discarded if we generalize the concept
of a saltus function suitably. For instance let points $x_n$ and
non-negative numbers $h_n$, $h_n'$ be given; the function defined by the
formula

$$H(x) = \sum_{x_n \le x} h_n + \sum_{x_n < x} h_n'$$

has at each point $x_n$ a saltus $h_n$ to the left (so that
$H(x_n) - H(x_n - 0) = h_n$), a saltus $h_n'$ to the right (so that
$H(x_n + 0) - H(x_n) = h_n'$), and a total saltus $h_n + h_n'$. If F(x) is a
non-decreasing function which has salti $h_n$ to the left and $h_n'$ to the
right at $x_n$, then subtracting H(x) we get a continuous non-decreasing
difference

$$G(x) = F(x) - H(x).$$




2. FUNCTIONS OF BOUNDED VARIATION


1. Starting with non-decreasing functions, we can construct 
a wider class of functions which have a derivative almost every- 
where. Together with non-decreasing functions their differences 
evidently have a derivative almost everywhere. We shall give an 
intrinsic definition of functions which are differences of non-de- 
creasing functions. We start with the following definition: 

Definition: A function F(x), defined on the closed interval
$a \le x \le b$, is said to be a function of bounded variation if for any
partition of the interval

$$a = x_0 < x_1 < \dots < x_n = b$$

the sum

$$\sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| \tag{1}$$

remains bounded above by a fixed constant.

Every non-decreasing function F(x) is of bounded variation 
since the sum (1) is then independent of the partition and is equal 
to F(b) — F(a). The sum and difference of two functions of 
bounded variation are obviously themselves of bounded variation. 
In particular, the difference of two non-decreasing functions is of 
bounded variation. We shall now see that the converse is true; 
every function of bounded variation can be represented as the 
difference of two non-decreasing functions. 

Let F(x) be a function of bounded variation; we shall call the
exact upper bound of the sums (1) under all possible partitions
of the interval [a, b] the total variation of F(x) on [a, b] and denote
it by $V_a^b[F]$. We shall show that for a < c < b

$$V_a^b[F] = V_a^c[F] + V_c^b[F]. \tag{2}$$

If c is one of the partitioning points of [a, b], so that $x_m = c$, say,
then

$$\sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| = \sum_{k=0}^{m-1} |F(x_{k+1}) - F(x_k)| + \sum_{k=m}^{n-1} |F(x_{k+1}) - F(x_k)|. \tag{3}$$




The sums on the right can be made arbitrarily close to
$V_a^c[F] + V_c^b[F]$ by a sufficient refinement of the subpartitions. We can
therefore assert that

$$V_a^b[F] = \sup \sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| \ge V_a^c[F] + V_c^b[F]. \tag{4}$$

On the other hand the addition of a further partitioning point c
to an arbitrary partition of the interval can only increase the sum
(1). Hence for any partition, whether it contains the point c or not,
we have in virtue of (3):

$$\sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| \le V_a^c[F] + V_c^b[F].$$

Taking the upper bound on the left-hand side, we get

$$V_a^b[F] \le V_a^c[F] + V_c^b[F]. \tag{5}$$


The result (2) follows, as required, on combining (4) and (5). In
particular, $V(x) = V_a^x[F]$ is a non-decreasing function. The
difference G(x) = V(x) - F(x) is also a non-decreasing function, since
evidently for x' < x'' we have $V(x'') - V(x') = V_{x'}^{x''}[F]
\ge F(x'') - F(x')$ and therefore

$$V(x'') - F(x'') \ge V(x') - F(x').$$

Thus a function of bounded variation can be represented as the
difference of two non-decreasing functions

$$F(x) = V(x) - G(x).$$
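
The decomposition F = V - G can be imitated on a fixed partition. The sketch below is an illustration under our own assumptions (the function $x^3 - x$, the uniform grid on [-1, 1] and the helper names are not from the text): it replaces V(x) by the running sums (1) taken along the grid, which bound the true total variation from below, and checks that both V and G = V - F are non-decreasing along the grid.

```python
from fractions import Fraction

def variation_sums(values):
    """Running sums V_k = sum over i < k of |values[i+1] - values[i]|, V_0 = 0."""
    running = [Fraction(0)]
    for prev, cur in zip(values, values[1:]):
        running.append(running[-1] + abs(cur - prev))
    return running

def is_non_decreasing(seq):
    """True if every term is at least as large as the preceding one."""
    return all(a <= b for a, b in zip(seq, seq[1:]))

# Hypothetical example: F(x) = x**3 - x on a uniform partition of [-1, 1].
grid = [Fraction(k, 50) - 1 for k in range(101)]
F = [x**3 - x for x in grid]

V = variation_sums(F)                # partition analogue of V(x)
G = [v - f for v, f in zip(V, F)]    # partition analogue of G(x) = V(x) - F(x)

print(is_non_decreasing(V), is_non_decreasing(G), float(V[-1]))
```

Exact fractions are used so that the monotonicity checks are not disturbed by rounding.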


2. Many additional properties which may be possessed by the
function F(x) carry over to its non-decreasing components V(x),
G(x). Such a property, for example, is continuity, two-sided or
one-sided. We shall show that if F(x) is continuous, say to the
right, at $x = x_0$, then so also are V(x) and G(x). It is enough to
show this for V(x). Since F(x) is continuous to the right at $x_0$,
we can find for a given $\varepsilon > 0$ a corresponding $\delta > 0$, such that for
any $x_1 > x_0$ which exceeds $x_0$ by less than $\delta$ we have

$$|F(x_1) - F(x_0)| < \frac{\varepsilon}{2}.$$




We construct a partition $x_0 < x_1 < \dots < x_n = b$ of the closed
interval $[x_0, b]$ such that

$$\sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| > V_{x_0}^b[F] - \frac{\varepsilon}{2}.$$

The point $x_1$ here can always be required to satisfy the condition
$x_0 < x_1 < x_0 + \delta$. We get

$$V_{x_0}^b[F] < \sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| + \frac{\varepsilon}{2} < \sum_{k=1}^{n-1} |F(x_{k+1}) - F(x_k)| + \varepsilon \le V_{x_1}^b[F] + \varepsilon,$$

and therefore

$$V(x_1) - V(x_0) = V_{x_0}^{x_1}[F] = V_{x_0}^b[F] - V_{x_1}^b[F] < \varepsilon,$$

from which it follows that V(x) is continuous to the right at
$x = x_0$. Similarly V(x) can be proved continuous to the left if the
original function F(x) is supposed continuous to the left. We
therefore get: if a function F(x) of bounded variation is continuous
on the closed interval [a, b], then so is the function $V(x) = V_a^x[F]$
and likewise G(x) = V(x) - F(x).

Conversely many properties of the function F(x) can be foreseen 
from the properties of its non-decreasing components. Thus by 
Lebesgue’s theorem (Section 1) every non-decreasing function has 
a derivative almost everywhere. Hence any function of bounded 
variation has a derivative almost everywhere, since it is the difference 
of two non-decreasing functions. 


Problems. 1. Show that the product of two functions $F_1(x)$, $F_2(x)$ of
bounded variation is again a function of bounded variation, with

$$V_a^b[F_1 F_2] \le \max |F_1(x)| \, V_a^b[F_2] + \max |F_2(x)| \, V_a^b[F_1].$$

2. Let $F(x) \ge \alpha > 0$ be a function of bounded variation. Show that 1/F(x)
is also a function of bounded variation, with

$$V_a^b\left[\frac{1}{F}\right] \le \frac{1}{\alpha^2} V_a^b[F].$$


3. A curve y = F(x) ($a \le x \le b$) is said to be rectifiable if the length of
the polygonal lines with successive vertices at the points $(x_0, F(x_0)), \dots,
(x_n, F(x_n))$, where $a = x_0 < x_1 < \dots < x_n = b$, is bounded by a fixed
constant which is independent of the number n and the choice of points
$x_1, \dots, x_{n-1}$. Show that the curve y = F(x) is rectifiable if and only if the
function F(x) is of bounded variation.

Hint. Use the inequality

$$|\Delta y_j| \le \sqrt{|\Delta x_j|^2 + |\Delta y_j|^2} \le |\Delta x_j| + |\Delta y_j|.$$

4. Prove that the continuous function $x^\alpha \sin (1/x^\beta)$ on the closed interval
[0, 1] is of bounded variation for $\alpha > \beta$ but not for $\alpha \le \beta$.

5. Does there exist a continuous function F(x) which is not of bounded 
variation over any interval? 

Answer. For instance, a function such as Weierstrass’ function, which has 
no derivative anywhere. 


6. Define a norm

$$\|F\| = V_a^b[F]$$

on the space of all functions F(x) of bounded variation on the closed interval
[a, b] (functions which differ by a constant are considered equivalent). Show
that a complete normed space results.
7. Put $V(x) = V_a^x[F]$, where F(x) is a function of bounded variation; show
that almost everywhere

$$V'(x) = |F'(x)|.$$

Hint. For a given n construct a partition $a = x_0 < x_1 < \dots < x_m = b$ such
that the sum $\sum_k |F(x_{k+1}) - F(x_k)|$ differs from $V_a^b[F]$ by less than $1/2^n$.
Put $F_n(x) = \pm F(x) + C_{nk}$ on the interval $(x_k, x_{k+1})$, choosing the signs $\pm$
and the constants $C_{nk}$ so that $F_n(x_{k+1}) - F_n(x_k) = |F(x_{k+1}) - F(x_k)|$. Show
that $V(x) - F_n(x)$ is non-decreasing, and apply Fubini's little theorem to the
convergent series $\sum_{n=1}^{\infty} [V(x) - F_n(x)]$.
8. Show that the functions F(x), V(x) have the same points of disconti- 
nuity and that their salti at these points coincide, disregarding sign. 
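
The dichotomy in problem 4 can be observed numerically. The following sketch is only an illustration (the function name `sampled_variation`, the choice of partition points and the exponents tried are our own assumptions): it sums the increments of $x^\alpha \sin(1/x^\beta)$ over the points where the sine equals +1 or -1, which gives a lower bound for the total variation on [0, 1].

```python
import math

def sampled_variation(alpha, beta, count):
    """Sum of |F(x_k) - F(x_{k+1})| over points where sin(1/x**beta) = +-1.

    F(x) = x**alpha * sin(1/x**beta) and x_k = (pi/2 + k*pi)**(-1/beta),
    k = 0, ..., count. The result is a lower bound for the variation on [0, 1].
    """
    xs = [(math.pi / 2 + k * math.pi) ** (-1.0 / beta) for k in range(count + 1)]
    values = [x**alpha * math.sin(x ** (-beta)) for x in xs]
    return sum(abs(a - b) for a, b in zip(values, values[1:]))

for count in (100, 10_000):
    print(count, sampled_variation(1, 1, count), sampled_variation(2, 1, count))
```

For $\alpha = \beta = 1$ the sums keep growing (roughly like a logarithm of the number of points), while for $\alpha = 2$, $\beta = 1$ they settle down; this is consistent with the statement of the problem but of course does not prove it.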
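3. Let us consider the indefinite integral of a summable function
$\varphi(x)$:

$$F(x) = \int_a^x \varphi(\xi) \, d\xi.$$

The function F(x) is of bounded variation since

$$\sum_{k=0}^{n-1} |F(x_{k+1}) - F(x_k)| = \sum_{k=0}^{n-1} \left| \int_{x_k}^{x_{k+1}} \varphi(\xi) \, d\xi \right| \le \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} |\varphi(\xi)| \, d\xi = \int_a^b |\varphi(\xi)| \, d\xi.$$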




(Alternatively F(x) can be represented immediately as the
difference of two non-decreasing functions, using the decomposition of
$\varphi(x)$ into positive and negative parts). By what we have proved,
the function F(x) has a derivative F'(x) almost everywhere.

We shall show that this derivative coincides almost everywhere
with the original function $\varphi(x)$. It is sufficient if we restrict
ourselves to a summable function $\varphi(x)$ in the class $C^+$ (Chapter IV,
Section 2). We have $\varphi(x) = \lim_{n \to \infty} h_n(x)$, where $h_n(x)$ is a step
function and $h_n(x) \le h_{n+1}(x)$ (n = 1, 2, ...). For integrals of the
functions $h_n(x)$ our assertion is immediately evident: if

$$H_n(x) = \int_a^x h_n(\xi) \, d\xi,$$

then $H_n'(x) = h_n(x)$ everywhere save at the discontinuities of $h_n(x)$.
Since the monotone increasing sequence $h_n(x)$ converges to $\varphi(x)$,
the sequence $H_n(x)$ has the function F(x) as its limit for every x.
Moreover F(x) can be written in the form of a series of
non-decreasing functions

$$F(x) = H_1(x) + \sum_{n=1}^{\infty} [H_{n+1}(x) - H_n(x)].$$

Applying Fubini's little theorem, we get

$$F'(x) = h_1(x) + \sum_{n=1}^{\infty} [h_{n+1}(x) - h_n(x)] = \varphi(x)$$

almost everywhere, the result required.
We have thus solved problem I, proposed in the introduction
to this chapter. We formulate the result in the form of a theorem:

THEOREM 1 (H. Lebesgue). If $\varphi(x)$ is a summable function, its
indefinite integral

$$F(x) = \int_a^x \varphi(\xi) \, d\xi$$

is a continuous function of bounded variation and it has almost
everywhere a derivative equal to $\varphi(x)$.
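
For a step function the content of Theorem 1 can be verified by hand; the sketch below does so with exact fractions. The particular step function (equal to 1 to the left of 1/2 and to 3 from 1/2 onwards on [0, 1]) and the names `phi`, `F` and `quotient` are assumptions made for this illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def phi(x):
    """Hypothetical step function on [0, 1]: 1 left of 1/2, 3 from 1/2 on."""
    return Fraction(1) if x < HALF else Fraction(3)

def F(x):
    """Indefinite integral of phi from 0 to x, written out in closed form."""
    return x if x <= HALF else HALF + 3 * (x - HALF)

def quotient(x, h):
    """Difference quotient (F(x + h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h

h = Fraction(1, 1000)
for x in (Fraction(1, 4), Fraction(3, 4)):
    # At points of continuity both one-sided quotients equal phi(x).
    print(x, quotient(x, h), quotient(x, -h), phi(x))
# At the jump the right and left quotients are 3 and 1: no derivative there.
print(quotient(HALF, h), quotient(HALF, -h))
```

The exceptional set here is the single point 1/2, which has measure zero, as the theorem allows.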




4. Lebesgue Points


We shall say that a point $x_0 \in [a, b]$ is a Lebesgue point of the
summable function $\varphi(x)$ if

$$\lim_{x \to x_0} \frac{1}{x - x_0} \int_{x_0}^{x} |\varphi(\xi) - \varphi(x_0)| \, d\xi = 0. \tag{1}$$

If the function $\varphi(x)$ is continuous, it is easily seen that any point
$x_0 \in [a, b]$ is a Lebesgue point of $\varphi(x)$. We shall show that in the
general case when $\varphi(x)$ is summable, almost every point of the
interval [a, b] is a Lebesgue point of $\varphi(x)$.

Let r be a fixed number. Then by theorem 1 the limiting relation

$$\lim_{x \to x_0} \frac{1}{x - x_0} \int_{x_0}^{x} |\varphi(\xi) - r| \, d\xi = |\varphi(x_0) - r|$$

holds on some set $E_r$ of full measure. We consider a countable
everywhere dense set of values of r, say the set of rational r. The
intersection E of all the $E_r$ is also a set of full measure. Let $x_0 \in E$
be such that $\varphi(x_0)$ is finite; we show that $x_0$ is a Lebesgue point.
For, given $\varepsilon > 0$, we can find r such that $|\varphi(x_0) - r| < \varepsilon/3$;
further, we can write:

$$\frac{1}{x - x_0} \int_{x_0}^{x} |\varphi(\xi) - \varphi(x_0)| \, d\xi \le \frac{1}{x - x_0} \int_{x_0}^{x} |\varphi(\xi) - r| \, d\xi + \frac{1}{x - x_0} \int_{x_0}^{x} |r - \varphi(x_0)| \, d\xi$$

$$= \left\{ \frac{1}{x - x_0} \int_{x_0}^{x} |\varphi(\xi) - r| \, d\xi - |\varphi(x_0) - r| \right\} + 2 |\varphi(x_0) - r|.$$

For sufficiently small $|x - x_0|$ the expression in curly brackets
becomes smaller than $\varepsilon/3$; hence for such $|x - x_0|$ the total sum
on the right-hand side will be less than $\varepsilon$, and the required
equation (1) follows.

A summable function $\varphi(x)$ has many of the properties of a
continuous function at its Lebesgue points.




We shall subsequently require a property of $\varphi(x)$ which is
expressed in the following lemma:

LEMMA. If $x_0$ is a Lebesgue point of the function $\varphi(x)$ and $E_n$ is a
sequence of measurable sets which contract to the point $x_0$, in the
sense that $E_n$ is disposed along an interval $\Delta_n$ which contains $x_0$ and
is of length $\delta_n \to 0$, and are such that $\mu E_n \ge \alpha \delta_n$ (where $\alpha > 0$ is
a fixed constant), then

$$\lim_{n \to \infty} \frac{1}{\mu E_n} \int_{E_n} \varphi(\xi) \, d\xi = \varphi(x_0). \tag{2}$$
En 


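Proof. We have, clearly,

$$\left| \varphi(x_0) - \frac{1}{\mu E_n} \int_{E_n} \varphi(\xi) \, d\xi \right| = \left| \frac{1}{\mu E_n} \int_{E_n} [\varphi(x_0) - \varphi(\xi)] \, d\xi \right| \le \frac{1}{\mu E_n} \int_{E_n} |\varphi(\xi) - \varphi(x_0)| \, d\xi$$

$$\le \frac{1}{\alpha \delta_n} \int_{\Delta_n} |\varphi(\xi) - \varphi(x_0)| \, d\xi.$$

If $x_0$ is the end-point of an interval $\Delta_n$, this expression has the limit 0
by hypothesis ($x_0$ is a Lebesgue point). If $x_0$ is an interior point
of $\Delta_n = (\alpha_n, \beta_n)$, the ratio obtained lies between†

$$\frac{1}{\alpha (x_0 - \alpha_n)} \int_{\alpha_n}^{x_0} |\varphi(\xi) - \varphi(x_0)| \, d\xi \quad \text{and} \quad \frac{1}{\alpha (\beta_n - x_0)} \int_{x_0}^{\beta_n} |\varphi(\xi) - \varphi(x_0)| \, d\xi$$

and therefore tends to zero, together with them, as required.
As a corollary, we get that at each Lebesgue point the function $\varphi(x)$
is equal to the derivative of its indefinite integral.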


† By a well-known arithmetical inequality the fraction (a + c)/(b + d)
lies between a/b and c/d.

Problems. 1. A point x is said to be a density point of a measurable set E
if

$$\lim_{\Delta \to x} \frac{\mu E(\Delta)}{\mu \Delta} = 1,$$

where $E(\Delta)$ denotes the part of E which is contained in the interval $\Delta$. Show
that almost all points of a set E are density points of that set.

Hint. Apply the lemma to the characteristic function of E, putting $E_n = \Delta_n$
(the interval which contracts to x).

2. Given that for any interval $\Delta$, $\mu E(\Delta) \ge \alpha \mu \Delta$, where $\alpha > 0$ is fixed,
show that E has full measure.

Hint. The complementary set has no density points.


COROLLARY. There exists no measurable set such that the measure of the
part of it falling in an arbitrary interval is precisely equal to half the length
of this interval.


3. A point $\xi$ is said to be a point of asymptotic continuity of a function F(x),
measurable over [a, b], if there exists a measurable set E on which F(x) is
continuous and for which $\xi$ is a density point. Prove that almost every point
$\xi \in [a, b]$ is a point of asymptotic continuity of F(x).

Hint. Every density point of a set $E_\varepsilon$, $\mu E_\varepsilon > b - a - \varepsilon$, on which F(x)
is continuous (Chapter IV, Section 4, art. 7) satisfies the condition.


4. Verify that a summable function F(x) is asymptotically continuous at
every Lebesgue point; if F(x) is bounded, the converse is also true.

5. Construct an example of a summable function which has a Lebesgue
point at which (2) is not satisfied if for the $E_n$ are chosen certain sets for
which $\mu E_n / \mu \Delta_n \to 0$.
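
The ratio $\mu E(\Delta)/\mu \Delta$ from problem 1 is easy to compute when E is a finite union of intervals. The sketch below is an illustration only: the set E = [0, 1/2], the helper names, and the restriction to intervals $\Delta$ centred at the point are our own assumptions (the definition itself admits arbitrary contracting intervals).

```python
from fractions import Fraction

def measure_in_interval(pieces, left, right):
    """Measure of the part of E (a union of disjoint intervals) in (left, right)."""
    return sum((max(Fraction(0), min(b, right) - max(a, left)) for a, b in pieces),
               Fraction(0))

def relative_measure(pieces, x0, radius):
    """mu E(Delta) / mu Delta for Delta = (x0 - radius, x0 + radius)."""
    return measure_in_interval(pieces, x0 - radius, x0 + radius) / (2 * radius)

E = [(Fraction(0), Fraction(1, 2))]   # hypothetical set E = [0, 1/2]
for radius in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(relative_measure(E, Fraction(1, 4), radius),   # interior point: 1
          relative_measure(E, Fraction(1, 2), radius))   # end-point: 1/2
```

The interior point 1/4 is a density point of E, while the end-point 1/2 is not; the exceptional points form a set of measure zero, as problem 1 asserts.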


3. DETERMINATION OF A FUNCTION FROM ITS DERIVATIVE


We proceed now to the solution of problem II of the introduction 
to the present chapter. 

1. A function F(x) is given on the closed interval $a \le x \le b$,
having almost everywhere a derivative $F'(x) = \varphi(x)$.

It is asked: is the function $\varphi(x)$ summable and does the formula

$$F(x) = \int_a^x \varphi(\xi) \, d\xi + C \tag{1}$$

hold?


As we know, a necessary condition for (1) to hold is that F(z) 
should be of bounded variation. We shall assume that this condi- 
tion is satisfied and initially we shall suppose that F(x) is non- 
decreasing. We begin by answering the first question: we shall 
show that the derivative of a non-decreasing function is always 
summable. The derivative of the function F(x) is the limit of the 
ratio 


$$\Phi_h(x) = \frac{F(x + h) - F(x)}{h}.$$




The functions $\Phi_h(x)$ are non-negative and as $h \to 0$ converge almost
everywhere on the interval [a, b] to the limit F'(x)†. To prove F'(x)
summable we apply Fatou's theorem, Chapter IV, Section 3; this
guarantees that F'(x) is summable if the integrals of the $\Phi_h(x)$
over [a, b] remain bounded. Assuming $\alpha$, $\beta$ to be points of continuity
of F(x), we have:

$$\int_\alpha^\beta \Phi_h(x) \, dx = \frac{1}{h} \int_\alpha^\beta F(x + h) \, dx - \frac{1}{h} \int_\alpha^\beta F(x) \, dx = \frac{1}{h} \int_\beta^{\beta + h} F(x) \, dx - \frac{1}{h} \int_\alpha^{\alpha + h} F(x) \, dx.$$

This quantity has the limit $F(\beta) - F(\alpha)$ as $h \to 0$ and is therefore
bounded. In consequence we are able to apply Fatou's theorem,
and we obtain in addition to the summability of F'(x) the inequality

$$\int_\alpha^\beta F'(x) \, dx \le F(\beta) - F(\alpha) \le F(b - 0) - F(a + 0).$$

Letting $\alpha \to a$, $\beta \to b$, we get that F'(x) is summable over [a, b] and

$$\int_a^b F'(x) \, dx \le F(b - 0) - F(a + 0).$$

In particular, if a, b are points of continuity of F(x), then

$$\int_a^b F'(x) \, dx \le F(b) - F(a). \tag{2}$$

The < sign can in fact obtain here, for example in the case of
a step function F(x) the derivative of which vanishes almost
everywhere.

† If x + h goes outside [a, b] we continue F(x) as a constant.

2. Indeed it turns out that the < sign in inequality (2) of art. 1
can arise in practice even for a continuous non-decreasing function
F(x).

As an example let us consider some closed nowhere dense set Z
(e.g. the Cantor set).

We showed previously (Chapter II, Section 4) that such a set
has the power of the continuum. We recall the construction. We
began by establishing a one-one correspondence between the set
of all contiguous open intervals of Z and the set of dyadic rationals
of the interval [0, 1] with order preserved, i.e. such that if the
interval $\Delta'$ lies to the left of $\Delta''$, then the corresponding dyadic
rationals r', r'' are connected by the inequality r' < r''. We then
extended it, on the one hand to all points of Z of the second type,
on the other to all the dyadic irrationals of [0, 1], still preserving
order. This correspondence determines a non-decreasing function
F(x) of a variable x which runs over the whole closed interval
[a, b] along which the set Z is disposed; the function F(x) varies
between 0 and 1; it is constant on the contiguous open intervals
of Z, taking on the corresponding dyadic rational value, while at
points of Z of the second type, it assumes the corresponding dyadic
irrational value. Since it is non-decreasing and assumes all values
in the interval [0, 1], it is continuous.

Its derivative exists in every case and vanishes at all points 
of contiguous open intervals; hence if the set Z is of measure 
zero, the derivative of F(x) vanishes almost everywhere and the 
< sign holds in (2). 

3. Thus we have to impose stronger conditions than continuity 
on the non-decreasing function F(x) to ensure equality in (2) of 
art. 1. 

Definition. A function F(x), defined on the closed interval [a, b],
is said to be absolutely continuous if for any $\varepsilon > 0$ we can find
$\delta > 0$ such that, for any (finite) system of non-overlapping
intervals $(a_1, b_1), \dots, (a_n, b_n)$ of overall length

$$\sum_{k=1}^{n} (b_k - a_k) < \delta,$$

the corresponding sum of absolute increments of the function
does not exceed $\varepsilon$:

$$\sum_{k=1}^{n} |F(b_k) - F(a_k)| < \varepsilon. \tag{1}$$

For example every function that satisfies the Lipschitz condition

$$|F(x'') - F(x')| \le C |x'' - x'|$$

is absolutely continuous, since for any system of intervals
$(a_1, b_1), \dots, (a_n, b_n)$

$$\sum_{k=1}^{n} |F(b_k) - F(a_k)| \le C \sum_{k=1}^{n} (b_k - a_k),$$

and to ensure that the sum on the left is less than a given $\varepsilon > 0$,
it is only necessary that the overall length of the system chosen
should not exceed $\delta < \varepsilon/C$.

On the other hand the continuous non-decreasing function F (x) 
of art. 2 with a derivative vanishing almost everywhere is not ab- 
solutely continuous. For the set Z can be covered with a countable 
system of non-overlapping intervals of arbitrarily small overall 
length, on which the sum of the increments of F(x) is equal to 1; 
on a sufficiently large finite subsystem this sum will exceed 1/2, 
which is incompatible with the definition of absolute continuity. 
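
The covering argument can be made concrete for the classical Cantor set. The sketch below is an illustration under our own assumptions (the ternary-digit evaluation of the Cantor function, the names `cantor` and `stage_intervals`, and the truncation depth are not from the text). It covers the set by the $2^n$ closed intervals of the n-th construction step and compares their total length with the sum of the increments of the function over them.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor function on [0, 1], following the ternary digits of x.

    Exact for the end-points of the intervals used below; for other x the
    value is truncated after `depth` digits.
    """
    result, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        if x == 0:
            return result
        if x == 1:
            return result + 2 * scale
        if x < Fraction(1, 3):
            x = 3 * x
        elif x > Fraction(2, 3):
            result, x = result + scale, 3 * x - 2
        else:                      # x lies in a (closed) removed middle third
            return result + scale
        scale /= 2
    return result

def stage_intervals(n):
    """The 2**n closed intervals left after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

for n in (1, 5, 10):
    cover = stage_intervals(n)
    total_length = sum(b - a for a, b in cover)
    total_increment = sum(cantor(b) - cantor(a) for a, b in cover)
    print(n, total_length, total_increment)
```

The total length $(2/3)^n$ tends to zero while the sum of the increments stays equal to 1, so no $\delta$ can serve for $\varepsilon < 1$ in the definition of absolute continuity.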

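A function F(x) which is the indefinite integral of a summable
function $\varphi(x)$, so that

$$F(x) = \int_a^x \varphi(\xi) \, d\xi,$$

is always absolutely continuous.
For we have:

$$\sum_{k=1}^{n} |F(b_k) - F(a_k)| = \sum_{k=1}^{n} \left| \int_{a_k}^{b_k} \varphi(\xi) \, d\xi \right| \le \sum_{k=1}^{n} \int_{a_k}^{b_k} |\varphi(\xi)| \, d\xi = \int_{E} |\varphi(\xi)| \, d\xi, \qquad E = \bigcup_{k} (a_k, b_k),$$

and in virtue of the absolute continuity of the integral over the
set (cf. p. 177), the result tends to zero together with the measure
of the system of intervals $(a_k, b_k)$.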

We observe a few simple properties of absolutely continuous 
functions. 

As a matter of fact, instead of inequality (1) in the definition of
an absolutely continuous function, we can write

$$\left| \sum_{k=1}^{n} [F(b_k) - F(a_k)] \right| < \varepsilon, \tag{2}$$

which seems rather surprising at first sight. But suppose (2) is
satisfied for any system of intervals of overall length $< \delta$. Having
fixed such a system, we distinguish two subsystems in it, so that
the increments of F(x) are positive in intervals of the first
subsystem, and negative in intervals of the second. On writing (2)
for each of these subsystems and adding, we in fact obtain (1),
with the unimportant replacement of $\varepsilon$ by $2\varepsilon$.




Every absolutely continuous function F(x) is of bounded variation.
For let $\delta > 0$ correspond to a given $\varepsilon > 0$ in the condition for
absolute continuity of F(x). Then on any interval of length $< \delta$
the variation of F(x) is bounded and does not exceed $\varepsilon$. Hence
on the whole interval [a, b], which we can represent as the union
of a finite number of closed intervals of length $< \delta$, the variation
of F(x) is again finite.

An absolutely continuous function, like any function of bounded
variation, can be represented as the difference of two non-decreasing
functions

$$F(x) = V(x) - G(x), \qquad V(x) = V_a^x[F]. \tag{3}$$


We claim that the minuend and subtrahend in the decomposition
(3) are also absolutely continuous. It is sufficient to confirm this
just for the minuend V(x).

Let $\delta > 0$ corresponding to a given $\varepsilon > 0$ in the condition for
absolute continuity of F(x) be found and let us consider a system
of intervals $(a_1, b_1), \dots, (a_n, b_n)$ of overall length $< \delta$. The sum

$$\sum_{k=1}^{n} \{V(b_k) - V(a_k)\} = \sum_{k=1}^{n} V_{a_k}^{b_k}[F] \tag{4}$$

is the exact upper bound of the quantities

$$\sum_{k=1}^{n} \sum_{j=0}^{m_k - 1} |F(x_{k,j+1}) - F(x_{k,j})|, \tag{5}$$

where $a_k = x_{k,0} < x_{k,1} < \dots < x_{k,m_k} = b_k$ is an arbitrary
subpartition of the interval $(a_k, b_k)$. Since the sum of the lengths of the
subpartition intervals for $(a_k, b_k)$ is equal to the length of the
interval $(a_k, b_k)$ itself and the total sum of the lengths of the intervals
$(a_k, b_k)$ is less than $\delta$, none of the expressions (5) can exceed $\varepsilon$ in
virtue of the absolute continuity of F(x); the required absolute
continuity of V(x) follows.

Like every function of bounded variation, an absolutely continuous
function has a derivative almost everywhere. We shall require the
following lemma:


Lemma: If an absolutely continuous non-decreasing function F(x) 
has a derivative which vanishes almost everywhere, then F(x) is 
constant. 




Proof. The domain of variation of F(x) is the closed interval
S = [F(a), F(b)]; to prove the lemma, we shall show that it is of
measure zero. We denote by Z the set (of measure zero) of points x
at which the derivative is either non-existent or non-zero, and by
E the set (of full measure) on which the derivative exists and
vanishes. The set Z is mapped by the function F(x) onto some
set F(Z), and the set E onto F(E); clearly S = F(E) + F(Z).
Given ε > 0, we find δ > 0 in the condition for absolute continuity
of F(x) and cover Z with a (possibly countable) system of
non-overlapping intervals of overall length < δ†; then the set F(Z)
will be covered by a system of intervals of overall length ≤ ε.
Since ε is arbitrarily small, the measure of F(Z) is equal to zero.
Further, since the derivative of F(x) vanishes at each point x ∈ E,
we can find for each such point a point ξ > x for which

$$F(\xi) - F(x) < \varepsilon(\xi - x),$$

or, what is the same thing,

$$\varepsilon\xi - F(\xi) > \varepsilon x - F(x).$$

Thus x is a point of ascent to the right for the function
G(x) = εx - F(x) (Section 1). By Riesz' lemma the set of points of
ascent to the right is open and for its component open intervals
$(a_k, b_k)$ we have

$$\varepsilon a_k - F(a_k) \le \varepsilon b_k - F(b_k)$$

or

$$F(b_k) - F(a_k) \le \varepsilon(b_k - a_k),$$

and therefore

$$\sum_k [F(b_k) - F(a_k)] \le \varepsilon \sum_k (b_k - a_k) \le \varepsilon(b - a).$$

Thus the set F(E) is covered by the system of intervals
$(F(a_k), F(b_k))$, which have arbitrarily small overall length. It
follows that F(E) is of measure zero. Hence the closed interval
S = F(E) + F(Z) is also of measure zero, i.e. it reduces to a single
point, as required.

Note. Analysing the proof given, we observe that it incorporates 
some more general results, viz: 


† A system of non-overlapping intervals can be obtained from an
arbitrary covering $\Delta_1 + \Delta_2 + \cdots \supset Z$ by discarding from each
$\Delta_n$ any points that belong to $\Delta_1 + \cdots + \Delta_{n-1}$.




(a) if F(x) is an absolutely continuous non-decreasing function, 
then the image F(Z) of any set Z of measure zero is also a set of 
measure zero. 

(b) if the function F(x) is non-decreasing and its derivative
vanishes on a set E, then the image F(E) of E is a set of measure
zero.

We can now prove the fundamental theorem of this paragraph,
which yields the answer to problem II.


THEOREM (A. Lebesgue). The derivative φ(x) of an absolutely
continuous function F(x) defined on the closed interval [a, b] is
summable, and for every x

$$\int_a^x \varphi(\xi)\,d\xi = F(x) - F(a).$$


Proof. An absolutely continuous function F(x) can be represented
as the difference of two absolutely continuous non-decreasing
functions; we can therefore assume without loss of generality that
the function F(x) is non-decreasing. Let φ(x) be its derivative;
by what has been proved, φ(x) is summable. We put

$$G(x) = \int_a^x \varphi(\xi)\,d\xi.$$

The function G(x) is absolutely continuous and, as we saw in
Section 2, its derivative coincides almost everywhere with φ(x).
But the derivative of the absolutely continuous function F(x) also
coincides with φ(x); hence the derivative of the difference
H(x) = F(x) - G(x) vanishes almost everywhere. The function H(x) is
absolutely continuous, being the difference of two absolutely
continuous functions. We observe that H(x) is also non-decreasing,
since in virtue of inequality (2) of art. 1

$$H(\beta) - H(\alpha) = F(\beta) - F(\alpha) - \int_\alpha^\beta \varphi(x)\,dx \ge 0.$$

Hence in accordance with the lemma H(x) is constant, equal to
$C_0$, say. But then we have:

$$F(x) = G(x) + H(x) = \int_a^x \varphi(\xi)\,d\xi + C_0.$$

Putting x = a, we get $C_0 = F(a)$, and the theorem is proved.




Let us consider a non-decreasing continuous, but not absolutely 
continuous function F(x). Repeating the foregoing argument, we 
find that the difference 


Z (x) = F(x) — G(x) 


is a continuous non-decreasing function and that its derivative 
vanishes almost everywhere. Since Z(x), together with F(x), is 
not absolutely continuous, it is not constant. We get a represen- 
tation of the continuous non-decreasing function F(x) in the form 
of a sum of two continuous non-decreasing functions 


$$F(x) = G(x) + Z(x), \qquad (6)$$


of which the first is absolutely continuous and the second has a 
derivative which vanishes almost everywhere. 

For an arbitrary non-decreasing function (possibly discontinuous)
this decomposition is completed in accordance with the last article
of Section 1 by the addition of one further term, a saltus function
H(x):

$$F(x) = G(x) + Z(x) + H(x).$$


From non-decreasing functions it is easy to pass to functions of
bounded variation. Equation (6) holds good for a continuous function
F(x) of bounded variation; G(x) is then an absolutely continuous
function, and Z(x) a so-called singular function, i.e. a
continuous function of bounded variation the derivative of which
vanishes almost everywhere.
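
The classical example of a non-constant singular function is the Cantor function on [0, 1], which the text does not name at this point. The sketch below is only an illustration under that assumption; the function names `cantor`, `stage_intervals` and `cover_stats` are hypothetical. It shows why absolute continuity fails: the 2^n intervals left after n removals of middle thirds have total length (2/3)^n, which tends to zero, while the sum of the increments of the function over them stays equal to 1.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Cantor function at x in [0, 1], read off from the ternary digits of x."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:          # x lies in a removed middle third (a plateau)
            return value + weight
        value += weight * (digit // 2)
        weight /= 2
    return value

def stage_intervals(n):
    """The 2**n closed intervals of length 3**-n left after n removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt += [(a, a + third), (b - third, b)]
        intervals = nxt
    return intervals

def cover_stats(n):
    """Total length of the stage-n intervals and total increment of the function on them."""
    ivs = stage_intervals(n)
    total_length = sum(b - a for a, b in ivs)
    total_increment = sum(cantor(b) - cantor(a) for a, b in ivs)
    return total_length, total_increment
```

Exact rational arithmetic is used so that the two totals can be compared without rounding error.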

We observe incidentally that the decomposition of a given continuous
function F(x) into absolutely continuous and singular
components is unique up to an additive constant. For let

$$F(x) = G(x) + Z(x) = G_1(x) + Z_1(x),$$

where $G$, $G_1$ are absolutely continuous and $Z$, $Z_1$ are singular;
then

$$G - G_1 = Z_1 - Z$$

and the function $G - G_1$, which is absolutely continuous, has together
with the right-hand side a derivative which vanishes almost
everywhere and must therefore be constant.




4. Integration by Parts 


Let φ(x), ψ(x) be summable functions and Φ(x), Ψ(x) their
indefinite integrals. We then have the formula

$$\int_a^b \Phi(x)\,\psi(x)\,dx + \int_a^b \varphi(x)\,\Psi(x)\,dx = \Phi(b)\,\Psi(b) - \Phi(a)\,\Psi(a). \qquad (1)$$

For the proof it is sufficient to observe that the function Φ(x)Ψ(x),
together with Φ(x) and Ψ(x), is absolutely continuous and its
derivative is given almost everywhere by the usual formula

$$(\Phi(x)\,\Psi(x))' = \Phi(x)\,\psi(x) + \varphi(x)\,\Psi(x),$$

from which (1) follows on integration between the limits a, b.
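
As a check on formula (1), here is a small worked example with a discontinuous integrand. The particular functions are chosen here purely for illustration, on [a, b] = [0, 1], with indefinite integrals taken from 0.

```latex
\varphi(x) = \begin{cases} 1, & 0 \le x < \tfrac12, \\ -1, & \tfrac12 \le x \le 1, \end{cases}
\qquad \psi(x) = 1,
\qquad
\Phi(x) = \begin{cases} x, & 0 \le x \le \tfrac12, \\ 1 - x, & \tfrac12 \le x \le 1, \end{cases}
\qquad \Psi(x) = x.

\int_0^1 \Phi\,\psi\,dx = \tfrac14, \qquad
\int_0^1 \varphi\,\Psi\,dx = \tfrac18 - \tfrac38 = -\tfrac14, \qquad
\Phi(1)\Psi(1) - \Phi(0)\Psi(0) = 0 \cdot 1 - 0 \cdot 0 = 0 .
```

The two integrals sum to zero, in agreement with the right-hand side.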


Problems. 1. Construct a continuous function, the derivative of which
exists everywhere but is not summable.

Answer. For example, $y = x^2 \sin(1/x^2)$, with y(0) = 0.

2. It is given that a function F(x) satisfies the following condition: for
any ε > 0 there exists δ > 0 such that the inequality $\sum (b_k - a_k) < \delta$
implies the inequality $\sum |F(b_k) - F(a_k)| < \varepsilon$ (i.e. in the condition
for absolute continuity the requirement that the intervals $(a_k, b_k)$ be
non-overlapping is dropped). Show that F(x) satisfies the Lipschitz
condition

$$|F(x'') - F(x')| \le C\,|x'' - x'|.$$

3. Prove the following converse of the note on the lemma (p. 308): if the
function F(x) is non-decreasing and the image F(Z) of the set Z on which
the derivative of F(x) is non-existent or infinite is of measure zero, then
F(x) is absolutely continuous.

4. If F(x) is absolutely continuous, then for p ≥ 1, but not necessarily
for p < 1, $|F(x)|^p$ is also absolutely continuous.

5. Prove that the total variation of the function
$F(x) = \int_a^x \varphi(\xi)\,d\xi$, where φ(x) is a summable function, over the
closed interval [a, b] is equal to

$$\int_a^b |\varphi(\xi)|\,d\xi.$$

Hint. A corollary to problem 7, art. 2, Section 2. An independent
construction: approximate φ(x) in the metric of $L_1$ by step functions.

6. Show that absolutely continuous functions form a closed subspace in 
the space of functions of bounded variation (see problem 6, art. 2, Section 2). 


4. FUNCTIONS OF SEVERAL VARIABLES


1. We now undertake the task of generalising the results of 
Sections 1-3, which relate to the case of a single variable, to the 
case of several variables. For simplicity we shall consider the case 
of two variables. 




We begin by formulating the facts familiar to us from classical
analysis. If φ(x, y) is a continuous function, we can form its
integral over any (closed) region G:

$$\Phi(G) = \iint_G \varphi(\xi, \eta)\,d\xi\,d\eta.$$

This naturally replaces the indefinite integral $\int_a^x \varphi(\xi)\,d\xi$ which
we considered when we were dealing with a single variable.

The function φ(x, y) of a point can be obtained from the function
Φ(G) of a region by means of a limiting operation in place of
differentiation

$$\varphi(x, y) = \lim_{G \to (x, y)} \frac{1}{|G|} \iint_G \varphi(\xi, \eta)\,d\xi\,d\eta,$$

where |G| denotes the area of the region G, and G → (x, y) denotes
that G shrinks to the point (x, y). If now φ(ξ, η) is an arbitrary
summable function defined on the rectangle

$$\Delta = \{a_1 \le x \le b_1,\ a_2 \le y \le b_2\},$$

we can take any measurable set $\mathcal{E}$ in place of the region G. The
following problems naturally arise:
Problem III. Does the limiting relation

$$\varphi(x, y) = \lim_{\mathcal{E} \to (x, y)} \frac{1}{\mu\mathcal{E}} \iint_{\mathcal{E}} \varphi(\xi, \eta)\,d\xi\,d\eta$$

hold everywhere in the region Δ if φ(x, y) is an arbitrary summable
function?
The converse problem can be as follows:
Problem IV. Let a function $\Phi(\mathcal{E})$ of a measurable set be given;
when can we assert that its "density"

$$\varphi(x, y) = \lim_{\mathcal{E} \to (x, y)} \frac{\Phi(\mathcal{E})}{\mu\mathcal{E}}$$

exists (if only almost everywhere) and under what conditions is the
function $\Phi(\mathcal{E})$ recoverable from its density in the form of an integral

$$\Phi(\mathcal{E}) = \iint_{\mathcal{E}} \varphi(\xi, \eta)\,d\xi\,d\eta\,?$$



We shall reduce the solution of both problems to the solution 
of the analogous problems for a single variable, which were con- 
sidered in Sections 1-3, by Riesz’ so-called principle of transition. 

2. THEOREM 1 (F. Riesz' principle of transition). It is possible
to establish between the set of points of the plane and the set of points of
the line an "almost" one-one correspondence such that measurable
subsets correspond to measurable subsets and measure is preserved.

Proof. We partition the ξ-axis into intervals (m, m + 1) of unit
length and the xy-plane into a square grill of unit mesh with
vertices at the integral points, then set up an arbitrary one-one
correspondence between the intervals of the axis and the squares
of the plane; the map is possible since each set is countable. We
further divide each square into four equal squares of area 1/4 and
the corresponding interval into four equal intervals of length 1/4,
and once more map each of the smaller squares arbitrarily onto
one of the smaller intervals to give a one-one correspondence. We
continue this construction indefinitely, taking care at each step
that intervals obtained by quadrisection of an antecedent interval
Δ correspond to squares obtained by quadrisection of the square
corresponding to Δ. We shall call the aggregate of all the squares
obtained in this way a plane net and the aggregate of all the
intervals on the line a linear net.
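
The construction leaves the matching of squares to intervals arbitrary at every step. The sketch below fixes one particular choice, made here only for illustration: within the unit square [0, 1) x [0, 1), the sub-square in column `bx` and row `by` is matched with quarter number `2*by + bx` of the current interval. The function name `plane_to_line` is hypothetical.

```python
from fractions import Fraction

def plane_to_line(x, y, depth):
    """Left end-point and length of the net interval matched with the
    net square of side 2**-depth that contains the point (x, y) of [0, 1)^2."""
    x, y = Fraction(x), Fraction(y)
    xi, length = Fraction(0), Fraction(1)
    for _ in range(depth):
        x *= 2
        y *= 2
        bx, by = int(x), int(y)      # which half in each coordinate (0 or 1)
        x -= bx
        y -= by
        length /= 4                  # quadrisection of the current interval
        xi += (2 * by + bx) * length
    return xi, length
```

After `depth` steps a square of area 4^(-depth) is paired with an interval of length 4^(-depth), and distinct squares of the same rank are paired with distinct intervals, which is the measure-preserving feature the proof relies on.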

By construction every sequence of nested squares of a plane 
net is mapped onto a sequence of nested intervals of the correspond- 
ing linear net, and conversely. We extend this correspondence as 
follows into a correspondence between the points of the plane and 
the points of the line. 

For definiteness we shall consider only "proper" sequences of
squares and intervals, i.e. those in which the nth term has measure
(area or length) exactly $1/4^n$.

If both coordinates of a point of the plane are dyadic-irrational,
there exists a unique proper sequence of squares of the plane net
all of which contain it. The corresponding sequence of intervals
on the line determines uniquely a point on the line, which we make
the image of the chosen point in the plane. If the point in the plane
has one coordinate (or both) dyadic-rational, there exist two (or
four) proper sequences of squares of the plane net which contain it;
we can therefore map it onto two (or four) points on the line.
Since the set A of all points of the plane with at least one rational
coordinate is of measure zero and can therefore be covered by a
system of squares of the net of arbitrarily small total area, the
set of corresponding doubletons and quadrupletons on the line can
be covered by a system of intervals of the same total length and
therefore has linear measure zero. (The set of quadrupletons is
evidently at most countable.) Conversely, if the coordinate of a
point on the line is dyadic-irrational, it can be mapped onto a
uniquely determined point of the plane, using the unique proper
sequence of intervals containing it. And if its coordinate is
dyadic-rational, it can be mapped in general onto a pair of points of
the plane; but the set of all points of the plane included in such pairs
is at most countable.

We can thus establish a one-one correspondence between two
sets of full measure: the set of points of the plane with both
coordinates dyadic-irrational excepting those included in the pairs
indicated, and the set of dyadic-irrational points of the line,
excepting the zero-measure set of points included in the specified
doubletons and quadrupletons. We shall show that under this
correspondence every measurable set, say on the interval [0, 1], is
mapped onto a measurable set of equal measure in the corresponding
square. We begin by observing that every open set on the line
can be expressed in the form of a union of some (countable) set
of intervals of the net without common interior points. Let E be
an arbitrary measurable set on the closed interval [0, 1]. There
exist sequences $O_n$, $O'_n$ (n = 1, 2, ...) of open sets with the property

$$O_n \supset E, \qquad O'_n \supset CE, \qquad 1 \le \mu O_n + \mu O'_n \le 1 + \varepsilon_n, \qquad \varepsilon_n \to 0.$$

Let $\mathcal{E}$ be the plane image of the set E. Considering the aggregate
of sets $G_n$, $G'_n$ in the plane, composed of the squares which
correspond to dyadic-rational intervals contained in $O_n$, $O'_n$, we get the
relations

$$G_n \supset \mathcal{E}, \qquad G'_n \supset C\mathcal{E}, \qquad 1 \le \mu G_n + \mu G'_n \le 1 + \varepsilon_n, \qquad \varepsilon_n \to 0.$$

It follows from these relations that $\mathcal{E}$ is measurable (Chapter IV,
Section 4, art. 5) and has measure

$$\mu\mathcal{E} = \lim \mu G_n = \lim \mu O_n = \mu E,$$

as required.




We can further extend the correspondence to measurable functions
on the line and on the plane, mapping a function f(ξ) onto
the function φ(x, y) for which φ(x, y) = f(ξ) if (x, y) is the image
of ξ. The functions f(ξ), φ(x, y) are then not only measurable, but
even "equi-measurable", in the sense that for any C the sets
{ξ: f(ξ) < C}, {(x, y): φ(x, y) < C} are of equal measure. It follows
further that the integrals of the functions f(ξ), φ(x, y) over
corresponding measurable sets E, $\mathcal{E}$ are equal.

3. We are now in a position to solve problem III. We shall say
that a sequence $\mathcal{E}_n$ of plane measurable sets converges properly to
the point P = (x, y) if there exists a sequence of squares $Q_n$ of the
plane net, containing P, and such that $\mu Q_n \to 0$, $\mathcal{E}_n \subset Q_n$,
$\mu\mathcal{E}_n \ge \kappa\,\mu Q_n$, where κ > 0 is a fixed constant.


THEOREM 2. If φ(x, y) is summable over a rectangle Δ, then for
almost all points of this rectangle

$$\varphi(x, y) = \lim_{n \to \infty} \frac{1}{\mu\mathcal{E}_n} \iint_{\mathcal{E}_n} \varphi(\xi, \eta)\,d\xi\,d\eta \qquad (1)$$

for any sequence of measurable sets $\mathcal{E}_n$ that converges properly to the
point (x, y).

Proof. By the principle of transition the summable function
φ(x, y) maps onto a summable function f(ξ), the point (x, y) in
the plane onto a point ξ on the line, the sequence of squares $Q_n$
onto a sequence of intervals $\Delta_n$ which contain ξ, and the sequence
of sets $\mathcal{E}_n$ onto a sequence of measurable sets $E_n \subset \Delta_n$ with
$\mu E_n \ge \kappa\,\mu\Delta_n$; furthermore

$$\frac{1}{\mu\mathcal{E}_n} \iint_{\mathcal{E}_n} \varphi(x, y)\,dx\,dy = \frac{1}{\mu E_n} \int_{E_n} f(\xi)\,d\xi.$$

By the lemma of Section 2 (art. 4) the last expression converges
to f(ξ) almost everywhere, to be precise, at all the Lebesgue points
of the function f(ξ). Reverting to the function φ(x, y), we get that
the limiting relation (1) holds almost everywhere in the rectangle
Δ, as required.
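
Relation (1) can be watched numerically in the simplest case where the sets are the net squares themselves (so that κ = 1). The following sketch is an illustration only: the function φ(x, y) = x² + y, the point (1/3, 1/5) and the helper names are all assumptions made here, chosen so that the mean value over a square is available in closed form.

```python
from fractions import Fraction

def mean_over_square(x0, y0, h):
    """Exact mean of phi(x, y) = x**2 + y over the square [x0, x0+h] x [y0, y0+h]."""
    mean_x2 = ((x0 + h) ** 3 - x0 ** 3) / (3 * h)   # (1/h) * integral of x**2
    mean_y = y0 + h / 2                              # (1/h) * integral of y
    return mean_x2 + mean_y

def net_square(x, y, n):
    """Lower-left corner and side of the net square of side 2**-n containing (x, y)."""
    h = Fraction(1, 2 ** n)
    return (x // h) * h, (y // h) * h, h

def density_errors(x, y, ranks):
    """|mean over the rank-n net square - phi(x, y)| for each n in ranks."""
    target = x ** 2 + y
    return [abs(mean_over_square(*net_square(x, y, n)) - target) for n in ranks]
```

For this smooth φ the error is of the order of the side of the square; for a general summable φ the theorem guarantees convergence only almost everywhere and gives no rate.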

4. We proceed now to the solution of problem IV.

We are given a function $\Phi(\mathcal{E})$ of a measurable set $\mathcal{E}$; it is asked:
when can we assert that $\Phi(\mathcal{E})$ is expressible in the form

$$\Phi(\mathcal{E}) = \iint_{\mathcal{E}} \varphi(\xi, \eta)\,d\xi\,d\eta, \qquad (1)$$

where φ(x, y) is a summable function?

We shall begin by giving necessary conditions for such a
representation to be possible. If $\Phi(\mathcal{E})$ has a representation in the
form (1), it obviously possesses the following properties:

(a) additivity: $\Phi(\mathcal{E}_1 + \mathcal{E}_2) = \Phi(\mathcal{E}_1) + \Phi(\mathcal{E}_2)$ if the measurable
sets $\mathcal{E}_1$, $\mathcal{E}_2$ are disjoint;

(b) absolute continuity: given ε > 0, we can find δ > 0 such
that $|\Phi(\mathcal{E})| < \varepsilon$ whenever $\mu\mathcal{E} < \delta$ (cf. p. 177).

We shall show that if these properties of the function $\Phi(\mathcal{E})$
obtain, even if not for all measurable sets but only for those composed
of squares of the grill, we are assured that $\Phi(\mathcal{E})$ is expressible in the
form (1).

By the principle of transition, we can map the function $\Phi(\mathcal{E})$
onto a function Ψ(E) defined on measurable subsets of the x-axis.
We can pass from the set function Ψ(E) to a point function F(x)
by putting F(x) equal to the value of Ψ(E) on the interval
E = (a, x). The function F(x) obtained is absolutely continuous
in the usual sense, for given a system of intervals $\Delta_k = (a_k, b_k)$ of
total length < δ we have

$$\sum_k [F(b_k) - F(a_k)] = \sum_k \Psi(\Delta_k) = \sum_k \Phi(\mathcal{E}_k) = \Phi\Bigl(\sum_k \mathcal{E}_k\Bigr), \qquad (2)$$

where $\mathcal{E}_k$ is the plane set corresponding to the interval $\Delta_k$. Since
together with the sum of the lengths of the $\Delta_k$ the sum of the
measures of the $\mathcal{E}_k$ is less than δ, each of the sums (2) is less than
ε in absolute value, as required. In accordance with the fundamental
theorem of Section 3 the function F(x) can be expressed as the
integral of its derivative g(x) = F'(x):

$$F(x) = \int_a^x g(\xi)\,d\xi + F(a).$$

Under the principle of transition the summable function g(ξ) is
mapped onto a summable function φ(x, y) on the plane. We shall
show that the set function

$$G(\mathcal{E}) = \iint_{\mathcal{E}} \varphi(x, y)\,dx\,dy$$

coincides with the original function $\Phi(\mathcal{E})$. Since both $\Phi(\mathcal{E})$ and
$G(\mathcal{E})$ are additive and absolutely continuous, it is sufficient
to prove coincidence for sets $\mathcal{E}$ which are squares of the
grill, in view of the fact that by the theorem of Chapter IV,
Section 4, art. 3 any measurable set can be approximated by a
finite sum of such sets to within a set of arbitrarily small measure.
But if $\mathcal{E}$ is a square of the grill, it corresponds to an interval, say
Δ = (α, β), of the linear net; hence

$$\Phi(\mathcal{E}) = \Psi(\Delta) = F(\beta) - F(\alpha) = \int_\alpha^\beta g(x)\,dx,$$

$$G(\mathcal{E}) = \iint_{\mathcal{E}} \varphi(\xi, \eta)\,d\xi\,d\eta = \int_\alpha^\beta g(x)\,dx,$$

so that $\Phi(\mathcal{E}) = G(\mathcal{E})$, as required.
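By what we have proved above the function φ(x, y) can be
obtained directly from $\Phi(\mathcal{E})$ by means of a limiting operation

$$\varphi(x, y) = \lim_{\mathcal{E}_n \to (x, y)} \frac{1}{\mu\mathcal{E}_n} \iint_{\mathcal{E}_n} \varphi(\xi, \eta)\,d\xi\,d\eta = \lim_{\mathcal{E}_n \to (x, y)} \frac{\Phi(\mathcal{E}_n)}{\mu\mathcal{E}_n},$$

where $\mathcal{E}_n$ is a sequence of measurable sets which converges properly
to the point (x, y); the limit exists almost everywhere. We
thus have the following theorem: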


THEOREM 3. A necessary and sufficient condition for a set function
$\Phi(\mathcal{E})$ to have density φ(x, y) at almost every point and for the formula

$$\Phi(\mathcal{E}) = \iint_{\mathcal{E}} \varphi(\xi, \eta)\,d\xi\,d\eta$$

to hold for every measurable set $\mathcal{E}$ is that $\Phi(\mathcal{E})$ be additive and
absolutely continuous.

5. As an application of the theorem just proved we shall find
the general form of a continuous linear functional f[φ] on the
space $L_1(D)$ of all functions integrable over a given region D.

The condition of continuity of the functional f can be written
in the form of the inequality

$$|f[\varphi]| \le C\,\|\varphi\| = C \iint_D |\varphi(x, y)|\,dx\,dy$$

with fixed constant C.

We map the functional f onto the function $F(\mathcal{E})$ of a measurable
set $\mathcal{E}$ which has as value the value of the functional at the
characteristic function $\chi_{\mathcal{E}}$ of $\mathcal{E}$. The function $F(\mathcal{E})$ is additive
together with the functional f and satisfies the inequality

$$|F(\mathcal{E})| = |f[\chi_{\mathcal{E}}]| \le C \iint_D \chi_{\mathcal{E}}(x, y)\,dx\,dy = C\,\mu\mathcal{E},$$

from which it follows that $F(\mathcal{E})$ is absolutely continuous. By
theorem 3 it has at almost every point the density

$$g(x, y) = \lim_{\mathcal{E}_n \to (x, y)} \frac{F(\mathcal{E}_n)}{\mu\mathcal{E}_n}.$$

Here evidently the absolute value of the function g(x, y) does not
exceed C. By the same theorem the function $F(\mathcal{E})$ is recoverable
in terms of g(x, y) by the formula

$$F(\mathcal{E}) = \iint_{\mathcal{E}} g(\xi, \eta)\,d\xi\,d\eta = \iint_D \chi_{\mathcal{E}}(\xi, \eta)\,g(\xi, \eta)\,d\xi\,d\eta.$$

Thus for the characteristic functions of measurable sets we have

$$f[\chi_{\mathcal{E}}] = F(\mathcal{E}) = \iint_D \chi_{\mathcal{E}}(x, y)\,g(x, y)\,dx\,dy.$$

The functional

$$f_1[\varphi] = \iint_D \varphi(x, y)\,g(x, y)\,dx\,dy$$

is evidently a continuous linear functional on the space $L_1(D)$. It
coincides with the functional f[φ] at the characteristic functions
of measurable sets; but then it must coincide with f at all step
functions and in the limit in general at all functions $\varphi \in L_1(D)$.
Thus $f = f_1$, and we have proved the following theorem:

THEOREM. Every continuous linear functional f[φ] on the space
$L_1(D)$ is of the form

$$f[\varphi] = \iint_D \varphi(x, y)\,g(x, y)\,dx\,dy,$$

where g(x, y) is a bounded measurable function.
An analogous theorem obviously holds for the case of functions
of any number of independent variables.
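
A simple instance may help to fix the roles of the objects in the proof. The functional below, integration over a measurable subset $D_0 \subset D$, is an example chosen here for illustration and does not come from the text.

```latex
f[\varphi] = \iint_{D_0} \varphi(x, y)\,dx\,dy, \qquad
|f[\varphi]| \le \|\varphi\| \quad (C = 1), \qquad
F(\mathcal{E}) = \mu(\mathcal{E} \cap D_0),

g(x, y) = \chi_{D_0}(x, y) \ \text{almost everywhere}, \qquad
f[\varphi] = \iint_D \varphi(x, y)\,\chi_{D_0}(x, y)\,dx\,dy .
```

Here the density of the set function is the characteristic function of $D_0$, which is bounded by C = 1 as the theorem requires.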


6. Theorem 3, proved in art. 4, is the analogue of the theorem
(Section 3) on the differentiability and integral representation of
an absolutely continuous function of one variable.

We consider now what analogue exists for the theorem on the 
decomposition of an arbitrary monotone—say non-decreasing— 
continuous function F(x) into absolutely continuous and singular 
components. 

A continuous non-decreasing function F(x) determines a
non-negative additive function of an interval (α, β) of the linear net,
equal to F(β) - F(α), which can be extended over all finite
systems of intervals of the net (using additivity). By the principle
of transition a non-negative additive function $\Phi(\mathcal{E})$ will be given
in the plane on the system of squares of the grill. The function
$\Phi(\mathcal{E})$ is continuous in the sense that if a square $\mathcal{E}_n$ of the grill
converges to a point (possibly a boundary point), then
$\lim_{n \to \infty} \Phi(\mathcal{E}_n) = 0$. Conversely, any non-negative additive
continuous function $\Phi(\mathcal{E})$ defined on the finite systems of grill squares
maps onto a non-negative continuous function of the intervals (α, β)
of the linear net, and the function F(x) is consequently continuous
non-decreasing.

We decompose F(x) into absolutely continuous and singular
components:

$$F(x) = G(x) + Z(x).$$

To this decomposition corresponds a decomposition of the set
function, also into two components:

$$\Phi(\mathcal{E}) = \mathfrak{g}(\mathcal{E}) + \mathfrak{z}(\mathcal{E}).$$

The first component $\mathfrak{g}(\mathcal{E})$ is absolutely continuous and is
expressible as the integral of its density. Let us consider what
property is possessed by the second component $\mathfrak{z}(\mathcal{E})$. Let $\xi_0$ be
any point at which the derivative of the function Z(x) vanishes,
and let $(x_0, y_0)$ be the corresponding point in the plane; further,
let $\mathcal{E}_n$ be a sequence of measurable sets of a special kind, viz.
finite systems of grill squares, which converge properly to the
point $(x_0, y_0)$. The set $\mathcal{E}_n$ is included in the grill square $Q_n$ and
$\mu\mathcal{E}_n \ge \alpha\,\mu Q_n$, α > 0. The grill square $Q_n$ corresponds to the
interval $(\alpha_n, \beta_n)$ which converges to the point $\xi_0$. Hence

$$\frac{\mathfrak{z}(\mathcal{E}_n)}{\mu\mathcal{E}_n} \le \frac{1}{\alpha}\,\frac{Z(\beta_n) - Z(\alpha_n)}{\beta_n - \alpha_n} \to 0,$$

since the derivative of Z(x) vanishes at $\xi_0$. Thus if we define
density, not in terms of arbitrary measurable subsets, but in
terms of grill squares and their properly convergent finite sums
only, the density of the function $\mathfrak{z}(\mathcal{E})$ also vanishes almost
everywhere.

Note. Instead of considering the values of non-absolutely continuous
set functions $\Phi(\mathcal{E})$ on all sets measurable in the sense of
Lebesgue, we have been forced to confine ourselves to their values
on square nets and finite unions of such. This has not been a
matter of chance. By using a process analogous to the construction
of a system of Lebesgue-measurable sets, we can start from intervals
(Chapter IV, Section 4) and define for any non-negative
countably-additive function $\Phi(\mathcal{E})$ of rectangles a system σ(Φ)
of sets measurable with respect to the function Φ, or more briefly,
Φ-measurable. However, if $\Phi(\mathcal{E})$ is continuous but not absolutely
continuous, the system σ(Φ) certainly does not contain all
Lebesgue-measurable sets. For there always exists in this case a countable
set $\mathcal{E}_0$ for which $\mu\mathcal{E}_0 = 0$, $\Phi(\mathcal{E}_0) > 0$; further we can find a
Φ-non-measurable $\mathcal{E}_1 \subset \mathcal{E}_0$, so that $\mathcal{E}_1 \notin \sigma(\Phi)$; at the same time $\mathcal{E}_1$ is
Lebesgue-measurable and $\mu\mathcal{E}_1 = \mu\mathcal{E}_0 = 0$.


5. THE STIELTJES INTEGRAL


1. In constructing a theory of the integral we started from the
known values of the integrals of step functions. The integral of
an "elementary step" (a function equal to 1 on an interval Δ and
0 outside it) was put equal to the length of the interval. Intervals
of equal length gave rise to integrals having the same value, so
that under integration the line became a perfectly homogeneous
manifold, of uniform construction throughout. But many problems
by their nature prohibit us from regarding the line as uniform.
In some cases, e.g. a non-uniform string or rod, we discounted
non-uniformity by introducing a density variable. Unfortunately
the introduction of density does not always overcome the difficulty
(for example, a string loaded with a point bead). The most
expeditious approach to the study of such problems lies in
introducing an interval measure which allows for non-uniformity of
the line.

By an interval on the line we shall understand one of the following
five types of set:

(1) the closed interval [α, β] (both end-points included);

(2) the open interval (α, β) (both end-points excluded);

(3) the semi-interval (α, β] (the right end-point included);

(4) the semi-interval [α, β) (the left end-point included);

(5) the isolated point [α].

Let each interval Δ on the closed interval [a, b] be mapped onto
some non-negative number ρΔ so as to satisfy the condition of
total additivity: if an interval Δ is the union of intervals
$\Delta_1, \Delta_2, \ldots, \Delta_n, \ldots$ without common points, then

$$\rho\Delta = \rho\Delta_1 + \rho\Delta_2 + \cdots + \rho\Delta_n + \cdots$$

We say in such a case that we are given a Stieltjes measure.

Cases can occur where isolated points have a positive Stieltjes 
measure. However, this can happen comparatively seldom, at 
most for a countable set of points, since the sum of the measures 
of any finite number of points cannot exceed a fixed constant 
c>0. 


Examples. 1. ρΔ is the length of the interval Δ (Lebesgue measure).

2. $\rho\Delta = \int_\Delta \varphi(x)\,dx$,
where φ(x) is a fixed non-negative summable function.

3. ρΔ is equal to 1 for every Δ containing the point c, and 0 for
every Δ not containing c.

The condition for total additivity can be replaced by two
conditions: for (finite) additivity and continuity.

(a) The additivity condition: if the intervals $\Delta_1$, $\Delta_2$ are disjoint,

$$\rho(\Delta_1 + \Delta_2) = \rho\Delta_1 + \rho\Delta_2.$$

Naturally, if the additivity property holds for any pair of
non-overlapping terms, it also holds for any finite number of such
terms $\Delta_1, \ldots, \Delta_n$:

$$\rho(\Delta_1 + \cdots + \Delta_n) = \rho\Delta_1 + \cdots + \rho\Delta_n.$$

(b) The continuity condition: if a sequence of nested intervals
$\Delta_1 \supset \Delta_2 \supset \ldots$ has Δ as its intersection, $\rho\Delta = \lim \rho\Delta_n$.
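
The continuity condition can be checked directly on the measure of example 3 (unit mass at the point c). The two nested sequences below are chosen here for illustration, on the assumption that all the intervals involved lie in [a, b].

```latex
\Delta_n = [c - \tfrac1n,\, c]: \qquad \bigcap_n \Delta_n = [c], \qquad
\rho\Delta_n = 1 \to 1 = \rho[c];

\Delta_n = [d - \tfrac1n,\, d],\ d > c: \qquad \bigcap_n \Delta_n = [d], \qquad
\rho\Delta_n = 0 \ \text{once } \tfrac1n < d - c, \qquad \lim \rho\Delta_n = 0 = \rho[d] .
```

In the first case every interval contains c, so the isolated point [c] carries the whole mass; in the second the intervals eventually miss c.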

Properties (a) and (b) are easily seen to follow from the total 
additivity condition. The converse, that total additivity is a conse- 
quence of properties (a), (b), is more difficult; it will be proved 
below by using the theory of the integral (p. 323). 

The measure ρΔ can be considered not only on a closed interval
[a, b], but also on the entire line -∞ < x < ∞. If the measure
is finite for the entire line, there is no difference between this case
and the case of measure specified on a finite closed interval, so
that [a, b] can be replaced throughout what follows by the entire
line, on condition that its measure be finite.

We shall now define the integral corresponding to the Stieltjes 
measure. We begin as usual with the integration of step functions. 




We partition [a, b] into a finite set of non-overlapping intervals
$\Delta_1, \Delta_2, \ldots, \Delta_n$. A function h(x) which assumes a constant value
$h_j$ on the interval $\Delta_j$ (j = 1, 2, ..., n) is said to be a step function.
We define the Stieltjes integral of h(x) by the formula

$$I_\rho h = \sum_{j=1}^{n} h_j\,\rho\Delta_j.$$
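
The definition is easy to try out on a concrete measure. The sketch below assumes, for illustration only, the measure "length plus a unit mass at c = 1/2" (the sum of examples 1 and 3); the names `rho` and `stieltjes_integral` and the tuple layout of a step function are hypothetical. It shows that the integral depends on which piece of the partition contains the point of positive measure.

```python
from fractions import Fraction

def rho(left, right, left_closed, right_closed, c=Fraction(1, 2)):
    """Measure of an interval: its length plus 1 if it contains the point c."""
    has_c = (left < c < right) \
        or (c == left and left_closed) \
        or (c == right and right_closed)
    return (right - left) + (1 if has_c else 0)

def stieltjes_integral(pieces):
    """Sum of h_j * rho(Delta_j) for a step function given as a list of
    (value, left, right, left_closed, right_closed) tuples."""
    return sum(h * rho(l, r, lc, rc) for h, l, r, lc, rc in pieces)

half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)

# h = 2 on [0, 1/2), h = 5 on [1/2, 1]: the point mass is weighted by 5.
h_right = [(2, zero, half, True, False), (5, half, one, True, True)]
# h = 2 on [0, 1/2], h = 5 on (1/2, 1]: the point mass is weighted by 2.
h_left = [(2, zero, half, True, True), (5, half, one, False, True)]
```

With Lebesgue measure the two step functions would have the same integral; here they differ by (5 - 2) times the mass at c.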


We shall extend this concept of the integral from step functions
to ρ-measurable functions, the limits of sequences of step functions,
as we did in Chapter IV for the Lebesgue integral. An important
stage in the course of this construction was reached with the
concept of a set of measure zero. In our new scheme a set Z ⊂ [a, b]
is said to be a set of (Stieltjes) measure zero if for any ε > 0 it
can be covered by a finite or countable system of intervals of total
Stieltjes measure < ε. We observe that now isolated points can
have positive measure while entire closed intervals may be of
measure zero (as in example 3). A sequence of functions $f_n(x)$ is
said to converge almost everywhere if it converges at all points
of the closed interval, except perhaps on a set of (Stieltjes)
measure zero.

In particular, a sequence that converges almost everywhere
must actually converge at points bearing positive measure,
whereas it can behave quite arbitrarily on closed intervals of
measure zero.

We must define an integral for functions of the class $C_\rho^+$ composed
of functions f(x) which are the limits (almost everywhere)
of increasing sequences of step functions. In Chapter IV, Section 2
this construction was based on two lemmas relating to decreasing
sequences of step functions and asserting the equivalence of the
relations $h_n \searrow 0$, $I h_n \searrow 0$. In proving these lemmas it was essential
that the discontinuities of the functions $h_n$ (n = 1, 2, ...) should
constitute a set of measure zero. In our case, if we start from
arbitrary step functions, this condition will not be satisfied, since
there is nothing to stop a step function indulging in a jump just
at a point of positive measure. We extricate ourselves from this
difficulty very simply: we shall require that the initial step functions
do not have salti at points of positive measure. (As we saw above,
there is at most a countable set of such points.)

With the fulfilment of this condition we can reiterate the scheme
of Chapter IV, Sections 2-3 in its entirety. The (Stieltjes) class
$C_\rho^+$ is composed of the functions f(x) which are limits almost
everywhere of convergent increasing sequences of step functions
$h_n(x)$ with bounded integrals $I_\rho h_n$, and without salti at points of
the axis which have positive measure. The Stieltjes integral $I_\rho f$
is defined by the formula

$$I_\rho f = \lim I_\rho h_n.$$

The differences of functions of the class $C_\rho^+$ form a class of
functions $L_\rho$ which are said to be summable in the
Lebesgue-Stieltjes sense.

If $\varphi = f_1 - f_2$, where $f_1, f_2 \in C_\rho^+$, the Stieltjes integral of the
function φ is defined by the formula

$$I_\rho \varphi = I_\rho f_1 - I_\rho f_2.$$


The correctness of all these definitions carries over from the 
scheme given in Chapter IV. 

The integral $I_\rho\varphi$ is termed the Lebesgue-Stieltjes integral with
respect to the measure ρ and has the more pedantic denotation

$$I_\rho\varphi = \int_a^b \varphi(x)\,\rho(dx).$$

We observe that a step function h(x) with a discontinuity at a
point of positive measure can always be expressed as the limit
of an increasing or decreasing sequence of permissible step functions
and will therefore belong to the class $C_\rho^+$ or to $L_\rho$, so that
our restriction on the class of step functions used in constructing
the integral will not lead to any diminution of the totality of
summable functions. The class $L_\rho$ is a complete normed linear
space with the norm

$$\|\varphi\|_\rho = I_\rho(|\varphi|).$$

The spaces $L_\rho^p$ (p ≥ 1) are defined similarly. ρ-measurable functions
are defined in the natural way as the limits of sequences of step
functions which converge almost everywhere (in the measure ρ).
Further, ρ-measurable sets can be defined as sets the characteristic
functions of which are ρ-measurable; the ρ-measure of a
ρ-measurable set E is defined by means of the formula

$$\rho E = I_\rho(\chi_E(x)), \qquad \chi_E(x) = \begin{cases} 1 & \text{for } x \in E, \\ 0 & \text{for } x \notin E. \end{cases}$$



The ρ-measure is countably additive on a system of ρ-measurable
sets, i.e.

$$\rho(E_1 + E_2 + \cdots) = \rho E_1 + \rho E_2 + \cdots$$

if the sets $E_1, E_2, \ldots$ are ρ-measurable and have no common
points.

A ρ-measurable function can now be defined as a function f(x)
for which the set {x: f(x) < C} is ρ-measurable for any C. This
enables us to extend the whole of the Lebesgue theory (Chapter IV,
Section 4, art. 6) to ρ-integrals.

An essential role is played in all these constructions by the
condition for countable additivity of the $\rho$-measure on intervals,
or the formally weaker conditions for additivity and continuity
(p. 320). Firstly, these conditions are essential in the definition
of sets of $\rho$-measure zero. On the one hand, an interval
$\Delta$ is of measure zero if it can be ascribed the $\rho$-measure $\rho\Delta = 0$.
On the other hand, in accordance with the above conditions, an
interval $\Delta$ of measure zero can be covered by a system of intervals
(or even by one interval here) of arbitrarily small measure. The
continuity condition ensures the equivalence of these definitions.
Further, by our construction, the measure of an interval $\Delta$, one
or both ends of which are of positive measure, is defined by the
integral $I_\rho(\chi_\Delta)$, which is the limit of the measures $\rho\Delta_n$ of intervals,
the ends of which are of measure zero; the continuity condition
ensures that $I_\rho(\chi_\Delta) = \rho\Delta$. We note finally that a proof has now
been obtained via the theory of the integral of the total additivity
of a finitely-additive, continuous Stieltjes measure.

2. The measure $\rho$ can be used to determine a non-decreasing
function

\[ F(x) = \rho[a, x]. \]

Moreover $\rho$ is uniquely recoverable from the function $F(x)$ for
any interval; in fact we have:

\[ \rho[a, x] = F(x), \qquad (1) \]
\[ \rho(x, x'] = \rho[a, x'] - \rho[a, x] = F(x') - F(x). \qquad (2) \]


For the semi-interval $[a, x)$ we can write the following:

\[ \rho[a, x) = \sum_{n=1}^{\infty} \rho(x_n, x_{n+1}] + \rho[a, x_1]
   \qquad (x_1 < x_2 < \dots < x, \ \lim_{n \to \infty} x_n = x). \]

We note that the total additivity of the function $\rho$ is used here.
With the aid of the function $F$ the last relation can be written:

\[ \rho[a, x) = \sum_{n=1}^{\infty} [F(x_{n+1}) - F(x_n)] + F(x_1)
   = \lim_{n \to \infty} F(x_n) = F(x - 0). \]


It is now easy to find $\rho\Delta$ for all the remaining types of interval:

\[ \rho[x, x') = \rho[a, x') - \rho[a, x) = F(x' - 0) - F(x - 0), \qquad (3) \]
\[ \rho(x, x') = \rho[a, x') - \rho[a, x] = F(x' - 0) - F(x), \qquad (4) \]
\[ \rho[x, x'] = \rho[a, x'] - \rho[a, x) = F(x') - F(x - 0), \qquad (5) \]
\[ \rho[x] = \rho[a, x] - \rho[a, x) = F(x) - F(x - 0). \qquad (6) \]

The function $F(x)$ defined by (1) is continuous to the right at
every point, since

\[ F(x + 0) = \lim_{x_n \to x + 0} F(x_n) = F(x_1) - [F(x_1) - F(x_2)] - \dots
   = \rho[a, x_1] - \rho(x_2, x_1] - \dots = \rho[a, x] = F(x). \]

As shown by equation (6), at each point $x$ of non-zero $\rho$-measure,
the function $F(x)$ has a saltus equal to the $\rho$-measure of the point;
elsewhere $F(x)$ is continuous.

Conversely, an arbitrary non-decreasing function $F(x)$, continuous
to the right, determines a non-negative interval function
in accordance with formulae (1)-(6); it is easily seen that this
interval function is additive and continuous, and hence countably
additive. The function $F(x)$ is said to be the generating function
of the Stieltjes measure $\rho$.
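
Formulae (1)-(6) can be tried out on a small example. The sketch below is only an illustration: the atoms (point masses) are hypothetical values chosen here, not data from the text, and the names `F`, `F_left` and `rho` are introduced for this example. It builds $F(x) = \rho[a, x]$ for a purely atomic measure and recovers the measure of each type of interval from $F(x)$ and $F(x - 0)$.

```python
from fractions import Fraction as Fr

# Hypothetical atoms (point -> mass) on [a, b] = [0, 3]; not taken from the text.
ATOMS = {Fr(0): Fr(1, 2), Fr(1): Fr(2), Fr(2): Fr(1, 4)}

def F(x):
    """Generating function F(x) = rho[a, x]: total mass at points <= x."""
    return sum((p for t, p in ATOMS.items() if t <= x), Fr(0))

def F_left(x):
    """Left limit F(x - 0): total mass at points strictly below x."""
    return sum((p for t, p in ATOMS.items() if t < x), Fr(0))

def rho(x, x2, left_closed, right_closed):
    """Measure of an interval with ends x <= x2, following formulae (2)-(6).

    Open and half-open intervals are assumed to have x < x2.
    """
    upper = F(x2) if right_closed else F_left(x2)
    lower = F_left(x) if left_closed else F(x)
    return upper - lower

half_open = rho(Fr(0), Fr(1), False, True)   # (0, 1], formula (2)
closed = rho(Fr(0), Fr(1), True, True)       # [0, 1], formula (5)
open_ = rho(Fr(0), Fr(1), False, False)      # (0, 1), formula (4)
point = rho(Fr(1), Fr(1), True, True)        # [1],    formula (6)
```

In this example the closed interval [0, 1] picks up both atoms, the open interval (0, 1) picks up neither, and the single point 1 has measure equal to the saltus of $F$ there.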

In the case of finite measure on an infinite interval extending
to $-\infty$ on the left, the corresponding generating function is
given by the equation

\[ F(x) = \rho[-\infty, x], \]

which is formally obtained from (1) by substituting $-\infty$ for $a$.
We observe that the point $-\infty$ (and $+\infty$) can also carry a
positive measure.

In the general case the Lebesgue-Stieltjes integral constructed
from the function $F(x)$ is denoted by the symbol

\[ I_\rho \varphi = \int_a^b \varphi(x)\, dF(x). \qquad (7) \]




This notation is very convenient and, as we shall see below, it
conforms with the other universally adopted notations. We draw
attention to one distinction. Putting $\varphi(x) \equiv 1$, we get in
accordance with the definition

\[ \int_a^b dF(x) = \rho[a, b] = F(b), \]

and not $F(b) - F(a)$, as we might expect. Of course no importance
attaches to this exception unless $F(a) \ne 0$, i.e. unless
the point $a$ carries positive measure. In relation to the integral (7),
$F(x)$ is said to be the integrating function.

3. Let us consider some different types of integrating function.

(a) The integrating function $F(x)$ is a saltus function: then there
exist points $x_n$ and numbers $p_n$, with $\sum p_n < \infty$, such that $F(x)$
is given by an equation of the form

\[ F(x) = \sum_{x_n \le x} p_n. \]

As was shown in Section 1, art. 3, $F(x)$ is continuous to the right.

In this case the measure $\rho\Delta$ of an interval $\Delta$ is the sum of all
$p_n$ which correspond to points $x_n$ contained in $\Delta$. The integral of a
step function $h(x)$ which takes the value $h_j$ on an interval $\Delta_j$ is
equal to

\[ \sum_j h_j\, \rho\Delta_j = \sum_j h_j \sum_{x_n \in \Delta_j} p_n = \sum_{n=1}^{\infty} h(x_n)\, p_n. \]

If the increasing sequence of step functions $h_\nu(x)$ converges almost
everywhere in the measure $\rho$ to $f(x)$ with the $I_\rho h_\nu$ bounded, then
as $\nu \to \infty$ the $h_\nu(x)$ tend to $f(x)$ at each of the points $x_n$
($n = 1, 2, \dots$) and

\[ I_\rho f = \lim_{\nu \to \infty} I_\rho h_\nu = \sum_{n=1}^{\infty} f(x_n)\, p_n. \]

Taking differences, we find that the class $L_\rho$ is formed by functions
$\varphi(x)$ which are defined (uniquely) only at the points $x_n$ ($n = 1, 2, \dots$),
with

\[ I_\rho \varphi = \sum_{n=1}^{\infty} \varphi(x_n)\, p_n \quad \text{and} \quad \sum_{n=1}^{\infty} p_n |\varphi(x_n)| < \infty. \]

It is easily verified that any function $\varphi(x)$ with $\sum p_n |\varphi(x_n)| < \infty$
falls in the class $L_\rho$. This completely determines $L_\rho$.
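
As an illustration of case (a), the following sketch evaluates the integral with respect to a saltus function as the weighted sum $\sum \varphi(x_n) p_n$. The points $x_n = 1/n$, the jumps $p_n = 2^{-n}$ and the truncation at 50 terms are assumptions made for this example only. The function $\varphi(x) = 1/x$ is unbounded near 0, yet $\sum p_n |\varphi(x_n)| = \sum n 2^{-n} = 2$ is finite, so $\varphi$ falls in $L_\rho$.

```python
# Assumed example data: F jumps by p_n = 2**-n at x_n = 1/n (truncated series).
N_TERMS = 50
points = [1.0 / n for n in range(1, N_TERMS + 1)]
jumps = [2.0 ** -n for n in range(1, N_TERMS + 1)]

def F(x):
    """Saltus function F(x) = sum of p_n over all x_n <= x."""
    return sum(p for t, p in zip(points, jumps) if t <= x)

def integral(phi):
    """Stieltjes integral of phi with respect to F: sum of phi(x_n) * p_n."""
    return sum(phi(t) * p for t, p in zip(points, jumps))

total_mass = integral(lambda x: 1.0)        # equals rho[0, 1] = F(1)
unbounded = integral(lambda x: 1.0 / x)     # close to sum n * 2**-n = 2
```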
(b) The integrating function F(x) is absolutely continuous. 




Then the Stieltjes measure on any interval is of the form

\[ \rho[\alpha, \beta] = \rho[\alpha, \beta) = \rho(\alpha, \beta) = \rho(\alpha, \beta]
   = F(\beta) - F(\alpha) = \int_\alpha^\beta g(\xi)\, d\xi, \]

where $g(\xi) \ge 0$ is the derivative of $F(x)$. There are no points with
positive $\rho$-measure. The integral of a step function $h(x)$ taking
the value $h_j$ on $\Delta_j$ is equal to

\[ I_\rho(h) = \sum_j h_j\, \rho\Delta_j = \sum_j h_j \int_{\Delta_j} g(\xi)\, d\xi
   = \int_a^b h(\xi)\, g(\xi)\, d\xi = I(hg), \]

where $I$ denotes the Lebesgue integral.

If the sequence $h_n$ is increasing and converges almost everywhere
in the measure $\rho$ to a function $f$, with the integrals $I_\rho h_n$
bounded, then the Lebesgue integrals $I(h_n g)$ are bounded and
the limit $fg$ of the sequence $h_n g$ is consequently summable in the
usual sense; we then have

\[ I_\rho f = \lim I_\rho h_n = \lim I(h_n g) = I(fg). \]

Taking differences, we see that the class $L_\rho$ consists of the functions
$\varphi$ for which the products $\varphi g$ are summable in the usual sense; we
then have for any $\varphi \in L_\rho$

\[ \int_a^b \varphi\, dF = I_\rho \varphi = I(\varphi g) = \int_a^b \varphi g\, dx \qquad (1) \]

(so that in this case we can replace $dF$ by $F'(x)\, dx = g\, dx$).

Every set of Lebesgue measure zero (we shall use the abbreviation:
$L$-measure 0) can be covered by a finite or countable
system $S$ of intervals of arbitrarily small overall length; the
$\rho$-measure of this system is equal to the integral over $S$ of the
function $g(x)$ and therefore tends to zero together with the ordinary
measure of $S$. It follows that every set of $L$-measure 0 is a set of
$\rho$-measure 0. We shall consider an arbitrary $L$-measurable set $E$.
We know that there exist closed sets $F$ and open sets $G$, $F \subset E \subset G$,
such that the difference $G - F$ has arbitrarily small $L$-measure.
The sets $G$ and $F$ are $\rho$-measurable, and by what has been proved,
the $\rho$-measure of $G - F$ is also arbitrarily small; it follows that
the set $E$ is $\rho$-measurable. Thus any $L$-measurable set is also
$\rho$-measurable. Moreover by formula (1) the $\rho$-measure of $E$ is equal to

\[ I_\rho \chi_E = I(\chi_E\, g) = \int_E g(x)\, dx. \]


Let us consider now the structure of $\rho$-measurable sets. Let $E$ be
a $\rho$-measurable set and let $\chi(x)$ be its characteristic function; then
by what we have proved $\chi g$ is summable in the usual sense and

\[ \rho E = \int_a^b \chi(x)\, dF(x) = \int_a^b \chi g\, dx. \]

Let $E_1$ be the set on which $\chi g > 0$; it is contained in $E$ and is
$L$-measurable. The set $E$ itself may be unmeasurable in the ordinary
sense, but, representing it in the form

\[ E = E_1 + (E - E_1), \]

we see that it can be expressed as the sum of two sets such that one
is $L$-measurable and $g(x)$ vanishes on the other. The converse is
also true: if some set $E$ can be expressed as the sum of two disjoint
sets $E_1$, $E_2$ such that one is $L$-measurable and $g(x)$ vanishes on the
other, then $E$ is $\rho$-measurable. Since an $L$-measurable set is
$\rho$-measurable, it is enough to show that the set $E_2$ on which $g(x)$
vanishes is $\rho$-measurable (and has $\rho$-measure zero). But the set $E_0$
of all points at which $g(x)$ vanishes is measurable in the ordinary
sense. Hence it is $\rho$-measurable and its $\rho$-measure is

\[ \rho E_0 = \int_{E_0} g\, dx = 0. \]

It follows that any subset $E_2 \subset E_0$ is $\rho$-measurable and has
$\rho$-measure zero, which completes the proof.

It was shown above that if the function $\varphi$ is $\rho$-summable, the
product $\varphi g$ is $L$-summable. We shall show that the converse is
also true: if for some function $\varphi$ the product $\varphi g$ is $L$-summable,
then $\varphi$ is $\rho$-summable. We show first that $\varphi$ is $\rho$-measurable. For
a given $C$ let us consider the set $E$ on which $\varphi(x) \le C$. This set
coincides with the set $E'$ on which $\varphi(x) g(x) \le C g(x)$, discounting
some set $E''$ on which $g(x)$ vanishes. The set $E'$ is $L$-measurable
and consequently $\rho$-measurable, while $E''$ is of $\rho$-measure zero;
hence $E$ is $\rho$-measurable. Since $C$ is arbitrary, we infer that the
function $\varphi$ is $\rho$-measurable. To prove that $\varphi$ is $\rho$-summable it is
enough to show that the integrals $I_\rho \varphi_N$ are bounded, where
$\varphi_N = \min(|\varphi|, N)$. But the function $\varphi_N$ is bounded and $\rho$-measurable
and consequently $\rho$-summable, so that in accordance with formula
(1)

\[ I_\rho \varphi_N = I(\varphi_N\, g) \le I(|\varphi|\, g), \]

as required. We have thus obtained a complete description of
$\rho$-summable functions: they are those and only those functions $\varphi(x)$
for which the product $\varphi g$ is summable in the ordinary sense.

4. It is also possible in the general case when the integrating
function $F(x)$ is an arbitrary non-decreasing function to give
some definitive though less effective property of the corresponding
space $L_\rho$. For this we consider the map of the $x$-axis into the
$y$-axis determined by the function $y = F(x)$. At points of continuity
this map is well-defined; we shall map a point of discontinuity $x$
onto the entire interval $[F(x - 0), F(x)]$. The inverse function
$x = G(y)$ has the same properties; each point $y$ can be mapped
by it onto a single point $x$ or onto an entire interval, the latter
only in the case that there exists an interval of the $x$-axis on
which $F(x)$ maintains a constant value $y_0$ (the set of such values
$y_0$ is at most countable). Each set $E$ on the $x$-axis is mapped onto
some set $F(E)$ on the $y$-axis; moreover an interval $\Delta$ of the $x$-axis
maps onto an interval of the $y$-axis with length exactly equal to
the $\rho$-measure of $\Delta$. To each function $f(x)$ corresponds a function
$g(y) = f(G(y))$, defined for those $y$ for which $G(y)$ is single-valued.
Thus the function $g(y)$ can be indeterminate at most on a fixed
countable set. A step function $h(x)$, equal to $h_j$ on the interval $\Delta_j$,
determines a corresponding step function $k(y)$ equal to $h_j$ on the
interval $F(\Delta_j)$; moreover

\[ I_\rho h = \sum_j h_j\, \rho\Delta_j = \sum_j h_j\, |F(\Delta_j)| = I k, \]

where $|F(\Delta_j)|$ denotes the length of the interval $F(\Delta_j)$. We see
that under the given map a step function $h$ on the $x$-axis corresponds
to a step function $k$ on the $y$-axis with the ordinary integral $I k$
equal to the $\rho$-integral of $h$.

It is now clear that the limiting process which produces a
$\rho$-summable function $\varphi(x)$ on the $x$-axis has an image on the $y$-axis
which leads to the function $\psi(y) = \varphi(G(y))$, which is summable
in the ordinary sense. Thus if $\varphi(x)$ is $\rho$-summable, the corresponding
function $\psi(y) = \varphi(G(y))$ is summable in the sense of Lebesgue and
$I_\rho \varphi = I \psi$. In addition $\psi(y)$ is constant on every interval that
corresponds to a point $x$ of positive $\rho$-measure. Conversely if the
function $\psi(y)$ is summable in the sense of Lebesgue and constant
on all the intervals that correspond to points $x$ of positive
$\rho$-measure, then it can be approximated with respect to the metric
of $L_1$ by step functions which are also constant on these intervals;
it follows that the function $\varphi(x) = \psi(F(x))$ is $\rho$-summable and
$I_\rho \varphi = I \psi$. Thus a function $\varphi(x)$ is $\rho$-summable if and only if the
function $\psi(y) = \varphi(G(y))$ is summable in the ordinary sense. In
particular, a set $E$ on the $x$-axis is $\rho$-measurable if and only if the
corresponding set $F(E)$ is measurable in the sense of Lebesgue.
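
The correspondence $y = F(x)$ can be checked numerically on a simple case. The sketch below assumes, purely for illustration, the integrating function $F(x) = x$ for $x < 1/2$ and $F(x) = x + 1$ for $x \ge 1/2$ on [0, 1] (a unit atom at $x = 1/2$ added to ordinary length) and the integrand $\varphi(x) = x^2$. The inverse map $G$ sends the whole interval $[1/2, 3/2]$ of the $y$-axis to the point $x = 1/2$, and the ordinary integral of $\varphi(G(y))$ over [0, 2] is compared with the Stieltjes integral $\int_0^1 x^2\, dx + \varphi(1/2) \cdot 1 = 7/12$. The midpoint rule and the grid size are choices made for this example.

```python
def G(y):
    """Inverse map for the assumed F: x on [0, 1/2), x + 1 on [1/2, 1]."""
    if y < 0.5:
        return y
    if y <= 1.5:
        return 0.5          # the interval [F(1/2 - 0), F(1/2)] maps to x = 1/2
    return y - 1.0

def phi(x):
    return x * x

def lebesgue_side(n=4000):
    """Midpoint-rule value of the ordinary integral of phi(G(y)) over [0, 2]."""
    h = 2.0 / n
    return sum(phi(G((i + 0.5) * h)) for i in range(n)) * h

# Stieltjes side: continuous part plus phi(1/2) times the unit atom.
stieltjes_side = 1.0 / 3.0 + phi(0.5) * 1.0
image_side = lebesgue_side()
```

On the interval $[1/2, 3/2]$ the function $\varphi(G(y))$ is constant, as the text requires for intervals that correspond to points of positive measure.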

A particular consequence of these results is that we can obtain
a generalisation of formula (1) of art. 3, which relates to the case
of an absolutely continuous integrating function, for the general
case.

We shall call a function $\Phi(x)$ absolutely continuous with respect
to the non-decreasing function $F(x)$ if for any $\varepsilon > 0$ there exists
$\delta > 0$ such that for any system of non-intersecting intervals
$(a_k, b_k)$ ($k = 1, 2, \dots, m$) the condition

\[ \sum_{k=1}^{m} [F(b_k) - F(a_k)] < \delta \]

entails the inequality

\[ \sum_{k=1}^{m} |\Phi(b_k) - \Phi(a_k)| < \varepsilon. \]

We can say that $\Phi$ is absolutely continuous relative to $F$, say,
if $\Phi$ satisfies a "Lipschitz condition" relative to $F$, i.e. for any
$\alpha$ and $\beta$,

\[ |\Phi(\beta) - \Phi(\alpha)| \le C\, |F(\beta) - F(\alpha)| \qquad (2) \]

with a fixed constant $C$.

By what we have proved above the function $\Phi(G(y)) = \Psi(y)$
is absolutely continuous with respect to the ordinary Lebesgue
measure on the $y$-axis and can therefore be expressed as the
integral of its derivative $\psi(y)$:

\[ \Psi(y) = \int^{y} \psi(\eta)\, d\eta \qquad (y = F(x)). \]

This in turn signifies that the function $\Phi(x)$ is the integral of the
function $g(x) = \psi(F(x))$ with respect to the measure $dF$:

\[ \Phi(x) = \int^{x} g(\xi)\, dF(\xi). \]

We note that $g(x)$ is bounded in absolute value by the constant
$C$ when the Lipschitz condition (2) is satisfied.

Let us suppose further that $\Phi$ is also non-decreasing. Then it is
easily verified in the same way that every function $f(x)$ summable
in the measure $d\Phi$ is also summable in the measure $dF$, and

\[ \int f(x)\, d\Phi(x) = \int f(x)\, g(x)\, dF(x). \]


The considerations which relate the Stieltjes and Lebesgue 
integrals will be developed at the end of the next paragraph in 
application to the case of functions of several variables. 


6. THE STIELTJES INTEGRAL (CONTINUED) 


1. In the last paragraph the measure which defined the Stieltjes
integral was supposed non-negative. But in fact our procedure can
be extended without difficulty to certain measures to which
negative values can be assigned.

We shall find it more convenient now to speak, not of actual
measures, but of their generating functions. Let $\Phi(x)$ be some
function of bounded variation on the closed interval $[a, b]$
($-\infty \le a < b \le \infty$), continuous to the right. Like every such
function it can be expressed as the difference of two non-decreasing
functions, also continuous to the right, one of which is the total
variation of $\Phi(x)$ (Section 2):

\[ \Phi(x) = V(x) - G(x). \]

The function $\Phi$ satisfies a Lipschitz condition with constant 1
relative to $V$:

\[ |\Phi(\beta) - \Phi(\alpha)| \le V_\alpha^\beta[\Phi] = V(\beta) - V(\alpha). \]

Hence, by (6),

\[ \Phi(x) = \int_a^x h(\xi)\, dV(\xi). \]

Going by analogy with (7), we put by definition, for any function
$\varphi(x)$, summable with respect to $V(x)$:

\[ \int_a^b \varphi(x)\, d\Phi = \int_a^b \varphi(x)\, h(x)\, dV(x). \qquad (1) \]

This defines the integral with respect to $\Phi(x)$. We recall our
assumption that $\Phi(x)$ is of bounded variation, and not necessarily
a non-decreasing function.
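
The decomposition $\Phi = V - G$ can be illustrated on a grid. The sketch below is a numerical approximation under assumptions made here: $\Phi(x) = \sin x$ on $[0, 2\pi]$ and a uniform grid of 2000 cells. It accumulates the absolute increments of $\Phi$ to approximate the total variation $V(x)$, forms $G = V - \Phi$, and checks that $G$ is non-decreasing, which is the grid form of the Lipschitz condition $|\Phi(\beta) - \Phi(\alpha)| \le V(\beta) - V(\alpha)$.

```python
import math

def variation_table(Phi, a, b, n):
    """Grid approximation of V(x): running sum of |increments| of Phi."""
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    V = [0.0]
    for x0, x1 in zip(xs, xs[1:]):
        V.append(V[-1] + abs(Phi(x1) - Phi(x0)))
    return xs, V

xs, V = variation_table(math.sin, 0.0, 2.0 * math.pi, 2000)

# G = V - Phi on the grid; its increments are |dPhi| - dPhi >= 0.
G = [v - math.sin(x) for x, v in zip(xs, V)]
total_variation = V[-1]          # close to 4 for sin on [0, 2*pi]
```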




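Integral (1) can also be expressed directly in terms of integrals
with respect to non-decreasing functions; we have

\[ G(x) = V(x) - \Phi(x) = \int_a^x [1 - h(\xi)]\, dV(\xi), \]

or

\[ \int_a^b \varphi\, dG = \int_a^b (1 - h)\, \varphi\, dV = \int_a^b \varphi\, dV - \int_a^b \varphi h\, dV, \]

so that

\[ \int_a^b \varphi h\, dV = \int_a^b \varphi\, dV - \int_a^b \varphi\, dG. \qquad (2) \]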


Finally, the integral defined by equation (1) can be obtained
by a direct construction, analogous to the construction of an
integral with respect to a non-negative measure. Let $\rho\Delta$ be the
measure corresponding to the generating function $V(x)$ and $\sigma\Delta$
the measure corresponding to $G(x)$. In accordance with formulae
(2)-(6) Section 5, art. 2, we construct a measure $\tau\Delta$ derived from
$\Phi(x)$:

\[ \tau(x, x'] = \Phi(x') - \Phi(x), \qquad \tau[x, x') = \Phi(x' - 0) - \Phi(x - 0), \]
\[ \tau(x, x') = \Phi(x' - 0) - \Phi(x), \qquad \tau[x, x'] = \Phi(x') - \Phi(x - 0). \]

The interval function $\tau\Delta$ is additive and continuous; this can be
seen immediately from the definition or from the equation
$\tau\Delta = \rho\Delta - \sigma\Delta$. It can also assume negative values; but for any
partition of the closed interval $[a, b]$

\[ [a, b] = \Delta_1 + \Delta_2 + \dots + \Delta_m \]

the condition

\[ |\tau\Delta_1| + |\tau\Delta_2| + \dots + |\tau\Delta_m| \le \sum_{j=1}^{m} (\rho\Delta_j + \sigma\Delta_j) \le C \qquad (3) \]

is satisfied, where $C$ is a fixed constant. The inequality (3) expresses
the fact that the original function $\Phi(x)$ is of bounded variation.
Further, notwithstanding that $\tau\Delta$ can take negative values, we
can define the integrals of step functions with respect to the
measure $\tau\Delta$ and then obtain the space $L_\tau$ by the usual limiting
process. In all evaluations of bounds the constant $C$ in (3) will have
to take the place of $\rho[a, b]$. Using the equation $\rho\Delta - \sigma\Delta = \tau\Delta$ at
each step, we arrive at formula (2), and together with this
formula (1).

2. The Riemann-Stieltjes integral. Given an arbitrary function
$f(x)$ and some generating function $\Phi(x)$ (of bounded variation), let
us consider the sums analogous to the Riemann sums

\[ \sum_{j=0}^{n-1} f(\xi_j)\, [\Phi(x_{j+1}) - \Phi(x_j)] \qquad (1) \]
\[ (a = x_0 < x_1 < \dots < x_n = b, \quad x_j \le \xi_j \le x_{j+1}). \]

The limit of such sums when the partitioning intervals
$\Delta x_j = x_{j+1} - x_j$ are indefinitely refined is said, if it exists, to be the
Riemann-Stieltjes integral of the function $f(x)$ with respect to the
function $\Phi(x)$. We shall show that it exists and coincides with the
already defined Lebesgue-Stieltjes integral of $f$ with respect to $\Phi$
over the interval $(a, b]$ if $f(x)$ is continuous. The integral sum
expressed is the Lebesgue-Stieltjes integral of the step function $h_n(x)$,
defined on the interval $(a, b]$ and equal to $f(\xi_j)$ on the interval
$\Delta_j = (x_j, x_{j+1}]$. When the $\Delta_j$ are refined indefinitely the step
function $h_n(x)$ tends uniformly to $f(x)$; hence in virtue of the
fundamental theorems of integral theory

\[ \sum_{j=0}^{n-1} f(\xi_j)\, [\Phi(x_{j+1}) - \Phi(x_j)] = \int_{(a, b]} h_n(x)\, d\Phi \to \int_{(a, b]} f(x)\, d\Phi, \]

as required.
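
The convergence of the sums (1) can be observed numerically. The sketch below is an illustration under assumptions chosen here: $f(x) = x$, the integrating function $\Phi(x) = x^2$ plus a jump of 3 at $x = 1/2$ (continuous to the right), a uniform partition of [0, 1], and tags $\xi_j$ taken at the right-hand end points. The expected limit is $\int_0^1 x \cdot 2x\, dx + f(1/2) \cdot 3 = 2/3 + 3/2$.

```python
def rs_sum(f, Phi, a, b, n):
    """Riemann-Stieltjes sum (1) on a uniform partition, xi_j = right end point."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        x_left, x_right = a + j * h, a + (j + 1) * h
        total += f(x_right) * (Phi(x_right) - Phi(x_left))
    return total

def Phi(x):
    """Assumed integrating function: x**2 plus a jump of 3 at x = 1/2."""
    return x * x + (3.0 if x >= 0.5 else 0.0)

approx = rs_sum(lambda x: x, Phi, 0.0, 1.0, 10_000)
exact = 2.0 / 3.0 + 3.0 * 0.5
```

The jump contributes through exactly one cell of the partition, whose tag tends to 1/2 as the partition is refined; this is where the continuity of $f$ is used.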
The actual definition of the Riemann-Stieltjes integral gives rise
to the inequality

\[ \left| \int_a^b f(x)\, d\Phi(x) \right| \le \sup_{x \in (a, b]} |f(x)| \; V_a^b[\Phi], \]

which replaces the bound

\[ \left| \int_a^b f(x)\, dx \right| \le \sup_{a \le x \le b} |f(x)| \; (b - a) \]

for the ordinary Riemann integral.

We observe that in defining the Riemann-Stieltjes integral
there is no need to ensure that the function $\Phi(x)$ is continuous to
the right. In fact if $f(x)$ is continuous the integral sums (1) have
a limit for any function $\Phi(x)$ of bounded variation and this limit
is independent of the values of $\Phi$ at its discontinuities. For the
proof, given some function $\Phi(x)$ of bounded variation, we put

\[ \Phi(x) = \Phi_0(x) + D(x), \]

where $\Phi_0(x)$ coincides with $\Phi(x)$ at all the points of continuity of
$\Phi(x)$ and is equal to $\Phi(x + 0)$ at its discontinuities, being
therefore continuous to the right; the function $D(x)$ is evidently distinct
from zero on an at most countable set $z_1, z_2, \dots$ of discontinuities
of $\Phi(x)$.

By what we have proved, the integral sums corresponding to the
function $\Phi_0(x)$ have the limit

\[ \int_a^b f(x)\, d\Phi_0(x). \]

We shall show that the integral sums corresponding to $D(x)$ tend
to zero. Since, together with $\Phi(x)$ and $\Phi_0(x)$, $D(x)$ is of bounded
variation, the sum of the moduli of all its values is finite. For a
given $\varepsilon > 0$ we find a number $N$ such that $\sum_{n > N} |D(z_n)| < \varepsilon$. Further,
we put

\[ D(x) = D_1(x) + \dots + D_N(x) + \bar{D}_N(x), \]

where $D_j(x)$ is non-zero only at the point $z_j$ ($j = 1, 2, \dots, N$) and
$\bar{D}_N(x)$ is non-zero only at the points $z_n$ ($n > N$). The integral sum
corresponding to the function $D_j(x)$ is equal either to 0 or to

\[ [f(\xi') - f(\xi'')]\, D_j(z_j), \]

where $\xi' \le z_j \le \xi''$ and $\xi'$, $\xi''$ lie in adjacent elements of the
partition of $[a, b]$. Since $f(x)$ is continuous, this quantity can be made
arbitrarily small by a suitable refinement of the partition. The
integral sum corresponding to the function $\bar{D}_N(x)$ admits the bound

\[ \left| \sum_j f(\xi_j)\, [\bar{D}_N(x_{j+1}) - \bar{D}_N(x_j)] \right| \le 2M \sum_{n > N} |D(z_n)| < 2M\varepsilon, \]

where $M = \max |f(x)|$; as we see, this sum can also be made
arbitrarily small. Thus the integral sums corresponding to the
function $D(x)$ do in fact tend to zero as the partition is increasingly
refined, and our assertion is proved.




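Problems. 1. Find the values of the Stieltjes integrals:

\[ I_1 = \int_{-1}^{3} x\, dF(x), \qquad
   F(x) = \begin{cases} 0 & \text{for } x = -1, \\ 1 & \text{for } -1 < x < 2, \\ -1 & \text{for } 2 \le x \le 3; \end{cases} \]

\[ I_2 = \int_{0}^{2} x^2\, dF(x), \qquad
   F(x) = \begin{cases} -1 & \text{for } 0 \le x < \tfrac{1}{2}, \\ 0 & \text{for } \tfrac{1}{2} \le x < \tfrac{3}{2}, \\ -2 & \text{for } \tfrac{3}{2} \le x \le 2. \end{cases} \]

Answer. $I_1 = -5$; $I_2 = -17/4$.

2. Write down an expression for the moment about the origin of a mass
distributed over the closed interval $[a, b]$ in such a way that the mass of the
closed interval $[a, x]$ is equal to $F(x)$.

Answer. $M = \int_a^b x\, dF(x)$.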
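
The answers to Problem 1 can be checked by summing over jumps, since for a piecewise-constant integrating function and a continuous integrand the Riemann-Stieltjes integral reduces to $\sum f(c_k) \cdot (\text{jump of } F \text{ at } c_k)$. In the sketch below the data for $I_1$ are read from the problem as printed; for $I_2$ the integrand $x^2$ and the break points 1/2 and 3/2 are an assumption, chosen because they are consistent with the stated answer $-17/4$. The helper name `step_integral` is introduced here for illustration.

```python
from fractions import Fraction as Fr

def step_integral(f, jumps):
    """Integral of a continuous f against a piecewise-constant F.

    jumps is a list of (point, size of the jump of F at that point).
    """
    return sum(f(c) * size for c, size in jumps)

# I_1: F rises from 0 to 1 at x = -1 and drops from 1 to -1 at x = 2.
I1 = step_integral(lambda x: x, [(Fr(-1), Fr(1)), (Fr(2), Fr(-2))])

# I_2 (assumed data): F rises from -1 to 0 at x = 1/2, drops from 0 to -2 at x = 3/2.
I2 = step_integral(lambda x: x * x, [(Fr(1, 2), Fr(1)), (Fr(3, 2), Fr(-2))])
```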


3. The Limiting Passage under the Riemann-Stieltjes Integral Sign


THEOREM 1 (E. Helly). If the functions $F_n(x)$ of bounded variation
converge at each point of the closed interval $a \le x \le b$ to some
function $F(x)$ and the total variations of all the $F_n(x)$ have a common
bound:

\[ V_a^b[F_n] \le C, \]

then the limiting function $F(x)$ is also of bounded variation and for
any continuous function $\varphi(x)$

\[ \lim_{n \to \infty} \int_a^b \varphi(x)\, dF_n(x) = \int_a^b \varphi(x)\, dF(x). \qquad (1) \]

Proof. We begin by showing that $F(x)$ has a bounded variation
not exceeding $C$. For given any partition $a = x_0 < x_1 < \dots < x_m = b$
of $[a, b]$ we have:

\[ \sum_{j=0}^{m-1} |F(x_{j+1}) - F(x_j)| = \lim_{n \to \infty} \sum_{j=0}^{m-1} |F_n(x_{j+1}) - F_n(x_j)| \le C, \]

and it follows that $V_a^b[F] \le C$.
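We now proceed to the proof of relation (1). First let $\varphi(x)$ be
a step function, equal to $h_j$ on the interval

\[ \Delta_j = (x_j, x_{j+1}]. \]

Then

\[ \int_a^b \varphi(x)\, dF_n(x) = \sum_j h_j\, [F_n(x_{j+1}) - F_n(x_j)], \]
\[ \int_a^b \varphi(x)\, dF(x) = \sum_j h_j\, [F(x_{j+1}) - F(x_j)]. \]

It is obvious that for sufficiently large $n \ge N$ these expressions
differ by less than a given $\varepsilon > 0$. In the general case for a given
$\varepsilon > 0$ we construct a step function $\varphi_\varepsilon(x)$ such that
$|\varphi(x) - \varphi_\varepsilon(x)| < \varepsilon / C$. Then

\[ \left| \int_a^b \varphi\, dF_n - \int_a^b \varphi_\varepsilon\, dF_n \right|
   = \left| \int_a^b [\varphi - \varphi_\varepsilon]\, dF_n \right| \le \frac{\varepsilon}{C}\, V_a^b[F_n] \le \varepsilon, \]

\[ \left| \int_a^b \varphi\, dF - \int_a^b \varphi_\varepsilon\, dF \right|
   = \left| \int_a^b [\varphi - \varphi_\varepsilon]\, dF \right| \le \frac{\varepsilon}{C}\, V_a^b[F] \le \varepsilon, \]

and hence for $n > N$

\[ \left| \int_a^b \varphi\, dF_n - \int_a^b \varphi\, dF \right|
   \le \left| \int_a^b [\varphi - \varphi_\varepsilon]\, dF_n \right|
   + \left| \int_a^b [\varphi - \varphi_\varepsilon]\, dF \right|
   + \left| \int_a^b \varphi_\varepsilon\, dF_n - \int_a^b \varphi_\varepsilon\, dF \right| \le 3\varepsilon, \]

which proves the theorem.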
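
Theorem 1 can be illustrated by a simple sequence. In the sketch below, which is an example constructed here and not taken from the text, $F_n(x) = \lfloor nx \rfloor / n$ on [0, 1] is a step function with jumps $1/n$ at the points $k/n$; it converges at every point to $F(x) = x$, and $V_0^1[F_n] = 1$ for all $n$. The integral of a continuous $\varphi$ with respect to $F_n$ is the sum of $\varphi(k/n)/n$, and for $\varphi = \cos$ it should approach $\int_0^1 \cos x\, dx = \sin 1$.

```python
import math

def integral_against_Fn(phi, n):
    """Integral of phi over [0, 1] against F_n(x) = floor(n x) / n.

    F_n has a jump of size 1/n at each point k/n, k = 1, ..., n.
    """
    return sum(phi(k / n) for k in range(1, n + 1)) / n

limit_value = math.sin(1.0)      # integral of cos against the limit F(x) = x
errors = [abs(integral_against_Fn(math.cos, n) - limit_value)
          for n in (10, 100, 1000)]
```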


Note 1. This theorem can be generalised somewhat by allowing
the integrand $\varphi(x)$ to depend on $n$. We claim that the relation

\[ \int_a^b \varphi(x)\, dF(x) = \lim_{n \to \infty} \int_a^b \varphi_n(x)\, dF_n(x) \]

holds if the following conditions are satisfied:
(a) the functions $F_n(x)$ are of uniformly bounded variation and
converge to $F(x)$ at every point of $[a, b]$;
(b) the continuous functions $\varphi_n(x)$ converge uniformly to their
limit $\varphi(x)$.
The proof follows quickly from the bounds

\[ \left| \int_a^b [\varphi(x) - \varphi_n(x)]\, dF_n(x) \right| \le \max |\varphi(x) - \varphi_n(x)| \; V_a^b[F_n], \]
\[ \left| \int_a^b [\varphi(x) - \varphi_n(x)]\, dF(x) \right| \le \max |\varphi(x) - \varphi_n(x)| \; V_a^b[F] \]

in conjunction with the theorem just proved.

Note 2. Theorem 1 (along with note 1) can be generalised in an
obvious way to an infinite interval of integration, say $[0, \infty]$, if
the functions $F_n(x)$ are of uniformly bounded variation throughout
the interval and $\varphi(x)$ (or $\varphi_n(x)$, if we are speaking of note 1)
is continuous, including at infinity; this latter property enables
$\varphi(x)$ to be uniformly approximated by step functions, which forms
an essential part of the proof.

Note 3. If $\varphi(x)$ is merely bounded instead of continuous at
infinity, so that $|\varphi(x)| \le M$, Helly's theorem remains valid if the
$F_n(x)$ satisfy the following supplementary condition (the role of
which amounts to the fact that the mass carried by the distribution
$F_n(x)$ does not become infinite as $n$ increases):

(*) Given any $\varepsilon > 0$, an $N = N(\varepsilon)$ can be found such that, for
all $n$,

\[ \operatorname{Var}_{|x| \ge N} F_n(x) \le \varepsilon. \qquad (2) \]

For, on passing to the limit in (2) as $n \to \infty$, we find that

\[ \operatorname{Var}_{|x| \ge N} F(x) \le \varepsilon \]

for the limit function $F(x)$ also.
Further, having found the $N$ from condition (*) for a given $\varepsilon > 0$,
we have

\[ \int_{-\infty}^{\infty} \varphi(x)\, dF_n(x) - \int_{-\infty}^{\infty} \varphi(x)\, dF(x)
   = \int_{-N}^{N} \varphi(x)\, d[F_n(x) - F(x)]
   + \int_{|x| \ge N} \varphi(x)\, dF_n(x) - \int_{|x| \ge N} \varphi(x)\, dF(x). \qquad (3) \]

Knowing $N$, and using Helly's theorem for the finite interval
$[-N, N]$, we can find an $n_0$ such that, for $n > n_0$,

\[ \left| \int_{-N}^{N} \varphi(x)\, d[F_n(x) - F(x)] \right| < \varepsilon. \]

The remaining two integrals do not exceed $2M\varepsilon$ in absolute value,
by construction; we see that the entire left-hand side of (3) does not
exceed $(2M + 1)\varepsilon$, which justifies the passage to the limit under
the Stieltjes integral sign in the case in question.

4. The application of theorem 1 is much facilitated by the
following theorem, which allows the extraction of a convergent
sequence from a given set of functions of (uniformly) bounded
variation.


THEOREM 2 (E. Helly). From any infinite set $K$ of functions $f(x)$
which are defined on the closed interval $a \le x \le b$ and have the
properties

\[ \max |f(x)| \le C, \qquad (1) \]
\[ V_a^b[f] \le V \qquad (2) \]

(where $C$, $V$ are constants independent of the choice of $f \in K$), it is
possible to choose a sequence $f_n(x)$ which converges at every point of
the interval $a \le x \le b$.

Proof. Let us suppose first that the functions $f(x)$ are non-decreasing.
Let $r_1, \dots, r_n, \dots$ be a sequence containing all the rational
points of $[a, b]$. Since the numbers $f(r_1)$ are bounded, there exists
a sequence of functions $f_{n1} \in K$ for which the numbers $f_{n1}(r_1)$ tend
to a limit. From this sequence $f_{n1}(x)$ we can extract a subsequence
$f_{n2}(x)$ for which the numbers $f_{n2}(r_2)$ converge (and also of course
the numbers $f_{n2}(r_1)$); continuing in this way we get for each $k$ a
sequence $f_{nk}(x)$ which converges at the points $r_1, r_2, \dots, r_k$ as
$n \to \infty$. The diagonal sequence $f_{nn}(x)$, which we shall denote
simply by $f_n(x)$, converges at each $x = r_1, r_2, \dots$ The limit $f(x)$
of the sequence $f_n(x)$, defined as yet only at the rational points,
represents a non-decreasing function. We complete its definition
by putting for each irrational $x$

\[ f(x) = \lim_{r \to x - 0} f(r) \qquad (r \text{ rational}). \]

As a result we get a non-decreasing function defined at all points
of the interval $[a, b]$. We shall show that it remains the limit of
the sequence $f_n(x)$ wherever it is continuous. Let $x_0$ be a point at
which $f(x)$ is continuous; given $\varepsilon > 0$, we find $\delta > 0$ such that
$|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$, and choose rational
points $r' < x_0 < r''$ such that $r'' < x_0 + \delta$, $r' > x_0 - \delta$. We also
find a number $N$ such that for $n \ge N$ we have $|f_n(r') - f(r')| < \varepsilon$,
$|f_n(r'') - f(r'')| < \varepsilon$. It follows that $|f_n(r'') - f_n(r')| < 4\varepsilon$. Since
the function $f_n(x)$ is non-decreasing, the number $f_n(x_0)$ lies between
$f_n(r')$ and $f_n(r'')$; hence

\[ |f(x_0) - f_n(x_0)| \le |f(x_0) - f(r')| + |f(r') - f_n(r')| + |f_n(r') - f_n(x_0)|
   \le \varepsilon + \varepsilon + |f_n(r'') - f_n(r')| \le 6\varepsilon, \]

and it follows that $f(x_0) = \lim f_n(x_0)$.

The sequence $f_n(x)$ constructed converges to $f(x)$ everywhere
with the possible exception of discontinuities of the function $f(x)$.
The set of these discontinuities is at most countable. Hence, again
applying the diagonal process, we can extract from the sequence
$f_n(x)$ a subsequence which converges even at the discontinuities of
$f(x)$. We have thus extracted from the given family of non-decreasing
functions a sequence which converges at every point of $[a, b]$
as required.

In the general case when the functions $f(x) \in K$ are not necessarily
non-decreasing we express each of them in the form

\[ f(x) = V(x) - G(x), \]

where $V(x)$ is the total variation of $f(x)$ over the closed interval
$[a, x]$. The functions $V(x)$ are non-decreasing and, by condition (2),
bounded; the functions $G(x)$ have the same properties. By what
we have proved there exists a sequence $f_n(x) \in K$ for which the
functions $V_n(x)$ converge at every point of the interval $[a, b]$.
From this sequence we can extract a subsequence $f_{n_k}(x)$ for which
the functions $G_{n_k}(x)$ converge at every point of $[a, b]$; but then the
$f_{n_k}(x)$ also converge at every point of $[a, b]$. The proof of theorem 2
is therefore complete.


5. The Stieltjes Integral for Several Variables 


For definiteness we shall discuss the case of two variables $x$, $y$,
varying over the square $\Delta_0 = \{a \le x \le b,\ a \le y \le b\}$. We shall
call an "interval" in the plane any set of points $(x, y)$ where each
of the coordinates runs through some interval (any of the five
types given in art. 1) on its own axis. Let us suppose that for each
interval $\Delta$ in the square $\Delta_0$ a function $\rho\Delta \ge 0$ is given, satisfying
the condition of total additivity: if $\Delta_1, \Delta_2, \dots, \Delta_n, \dots$ are mutually
disjoint, and their union $\Delta$ is also an interval, then

\[ \rho\Delta = \rho\Delta_1 + \dots + \rho\Delta_n + \dots \]

As above, we shall call the function $\rho\Delta$ the Stieltjes measure. We
partition the square $\Delta_0$ into a finite number of disjoint intervals
$\Delta_1, \dots, \Delta_n$; a function $h(x, y)$ equal to a constant $h_j$ on the set $\Delta_j$
is said to be a step function and its integral is defined by the
formula

\[ I_\rho h = \sum_j h_j\, \rho\Delta_j. \]


Applying the process already described many times above, we can
extend the integral $I_\rho$ over a wide class $L_\rho$ of functions which
are then said to be summable in the Lebesgue-Stieltjes sense with
respect to the measure $\rho$. The simplest way of ascertaining the
compass of the class and validating the construction is to apply a
method, similar to that used in Section 4 for Lebesgue measure,
of setting up a correspondence with linear measure. The method
is as follows. From the measure $\rho\Delta$ we construct two functions,
each of one variable:

\[ F_1(x) = \rho\Delta_x, \qquad F_2(y) = \rho\Delta_y. \]

Here $\Delta_x$, $\Delta_y$ denote the regions of the plane the points $(\xi, \eta)$ of
which satisfy the inequalities $\xi \le x$, $\eta \le y$ respectively. The
functions $F_1(x)$, $F_2(y)$ are non-decreasing and therefore possess at
most countable sets of discontinuities. Let $x_1, x_2, \dots, x_n, \dots$ be all
the discontinuities of $F_1(x)$. We can construct on the $x$-axis a set
of numbers of the form $x_0 + p/2^q$ ($x_0$ fixed; $p$, $q$ integers) not
containing any of the points $x_1, \dots, x_n, \dots$; this follows from the fact
that the set $\{x_n - p/2^q\}$ for all $n$, $p$, $q$ is countable and
consequently does not contain all the points of the axis; as $x_0$ we
take any point not contained in it. Similarly we construct on the
$y$-axis a set of numbers $y_0 + p/2^q$ not containing any of the
discontinuities of $F_2(y)$. The straight lines $x = x_0 + p/2^q$,
$y = y_0 + p/2^q$ ($p = 0, \pm 1, \pm 2, \dots$; $q$ fixed) form a net partition
of the plane into squares of side $1/2^q$, the boundaries of which
do not have positive measure. We can now set up a correspondence
between the squares of the plane net and the intervals on the axis,
just as we did in Section 4, with one modification: there we mapped
a square onto an interval of length equal to the area of the square,
but now we shall map it onto an interval of length equal to the
$\rho$-measure of the square. Of course we have to take care that
inclusion relations are preserved.

Just as in Section 4 we satisfy ourselves that the map constructed
is one-one up to a set of $\rho$-measure zero in the plane and a set
of Lebesgue measure zero on the line. To every $\rho$-measurable set
in the plane will correspond a Lebesgue-measurable set on the line
whose Lebesgue measure is equal to the $\rho$-measure of the original
set. We observe certain peculiarities of this correspondence. A point
$(x_0, y_0)$ in the plane, of positive $\rho$-measure, goes over into an
entire interval on the line of corresponding length. The aggregate
of such intervals will be at most countable and we denote them by
$\delta_1, \dots, \delta_n, \dots$ On the other hand a square of the plane net of
measure zero maps onto a solitary point of the axis.

To a step function on the plane which takes constant values on
the squares of the net, corresponds a step function on the line,
constant on each of the intervals $\delta_1, \delta_2, \dots$ The integrals of
corresponding step functions (one with respect to plane $\rho$-measure, the
other with respect to linear Lebesgue measure) are equal. By
means of a limiting passage through the specified linear step
functions we can obtain all the summable functions that are constant
on the intervals $\delta_1, \delta_2, \dots, \delta_n, \dots$; the corresponding limiting
passage in the plane will yield the totality of functions summable
with respect to the measure $\rho$. To each property of the Lebesgue
integral on the line which relates to functions constant on the
intervals $\{\delta_n\}$ corresponds some property of the Lebesgue-Stieltjes
integral with measure $\rho$. Thus the given correspondence absolves
us from the need to prove specially for the Lebesgue-Stieltjes
integral all the theorems that we proved (Chapter IV and VI) for the
Lebesgue integral; all these theorems carry over automatically to
the Lebesgue-Stieltjes integral.


6. The Generating Function in the Case of Several Variables 


Let $(\xi, \eta)$ be an arbitrary point of the square $\Delta_0$, and
let $\Delta_{\xi\eta}$ be the interval determined by the inequalities
$x \le \xi$, $y \le \eta$. We put

$$F(\xi, \eta) = \sigma \Delta_{\xi\eta}.$$


In accordance with the formulae analogous to (2)-(6) of Section 5,
art. 2, the function $F(\xi, \eta)$ allows the recovery of the measure
$\sigma\Delta$ for every interval $\Delta$ and is therefore said to be
the generating function of the measure $\sigma\Delta$. It is
"non-decreasing" in the sense that for $\xi \le \xi'$, $\eta \le \eta'$
we have

$$F(\xi, \eta) \le F(\xi', \eta'),$$

and is also "continuous above", i.e.

$$F(\xi + 0, \eta + 0) = \lim_{\substack{x \to \xi + 0 \\ y \to \eta + 0}} F(x, y) = F(\xi, \eta).$$

Conversely, every function that is non-decreasing and continuous
above in the sense indicated can serve as the generating function
for some totally additive measure.
The Stieltjes integral of a function $\varphi(x, y)$, constructed from
the generating function $F(\xi, \eta)$, is written in the form

$$\int_a^b \int_a^b \varphi(x, y)\, dF(x, y).$$


In place of a non-negative measure $\sigma\Delta$ we can consider a
measure $\sigma\Delta$ of bounded variation and of arbitrary sign; this
means that for any partition of the base square $\Delta_0$ into
intervals $\Delta_1, \ldots, \Delta_n$ we have

$$|\sigma\Delta_1| + \cdots + |\sigma\Delta_n| \le C, \qquad (1)$$

where $C$ is some fixed constant. Correspondingly we can consider
a generating function $F(x, y)$ of bounded variation in place of a
non-decreasing generating function; this implies that the measure
$\sigma\Delta$ constructed from $F(x, y)$ in accordance with the general
rules must satisfy inequality (1). Of course the assumption of
continuity above, which ensures the total additivity of the measure
$\sigma\Delta$, must be retained in the more general case.


7. APPLICATIONS OF THE STIELTJES INTEGRAL IN ANALYSIS


The Stieltjes integral has numerous applications. In this section
we shall give the derivation of three formulae from three
different branches of mathematical analysis. One further formula,
the representation of a positive-definite function, will be given in
Chapter VII, Section 7.


1. Linear Functionals on the Space C (a, b) 


The simplest continuous linear functional on the space $C(a, b)$
of all continuous functions $\varphi(x)$ on the closed interval $[a, b]$
is the value of the function $\varphi(x)$ at a fixed point $x = \xi$. It
turns out that the general form of a linear functional on this space is
obtained as a "Stieltjes combination" of such simple functionals; we
have the following theorem:


THEOREM 1. (F. Riesz). Every continuous linear functional $f[\varphi]$
on the space $C(a, b)$ can be written in the form

$$f[\varphi] = \int_a^b \varphi(\xi)\, dF(\xi), \qquad (1)$$

where $F(\xi)$ is some function of bounded variation, continuous to the
right.

The proof will be carried out in several stages. 

I. Let us consider a linear space $K$ of bounded functions $\varphi(x)$
defined on some set $X$. We suppose that a linear functional
$f[\varphi]$ on $K$ is given, satisfying the condition

$$|f[\varphi]| \le C \sup |\varphi(x)|, \qquad (2)$$

where $C$ is a fixed constant.

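We claim that for every bounded increasing sequence $\varphi_n(x) \in K$
the corresponding sequence $f[\varphi_n]$ converges. For, given
$\varphi \in K$, we can write

$$|f[\varphi]| = \pm f[\varphi] = f[\pm\varphi]$$

with the appropriate choice of sign. For a given bounded (by $M$,
say) increasing sequence $\varphi_n(x) \in K$ we can form the series

$$|f[\varphi_2 - \varphi_1]| + |f[\varphi_3 - \varphi_2]| + \cdots + |f[\varphi_{n+1} - \varphi_n]| + \cdots$$

With the reservation mentioned the $n$th partial sum of this series
can be written in the form

$$f[\pm(\varphi_2 - \varphi_1)] + f[\pm(\varphi_3 - \varphi_2)] + \cdots + f[\pm(\varphi_{n+1} - \varphi_n)]
= f[\pm(\varphi_2 - \varphi_1) \pm (\varphi_3 - \varphi_2) \pm \cdots \pm (\varphi_{n+1} - \varphi_n)].$$

But

$$|\pm(\varphi_2 - \varphi_1) \pm \cdots \pm (\varphi_{n+1} - \varphi_n)|
\le (\varphi_2 - \varphi_1) + \cdots + (\varphi_{n+1} - \varphi_n) = \varphi_{n+1} - \varphi_1 \le 2M,$$

so that in virtue of (2)

$$|f[\varphi_2 - \varphi_1]| + \cdots + |f[\varphi_{n+1} - \varphi_n]| \le 2MC$$

and consequently the series

$$f[\varphi_2 - \varphi_1] + f[\varphi_3 - \varphi_2] + \cdots + f[\varphi_{n+1} - \varphi_n] + \cdots$$

converges; but this implies the convergence of the sequence $f[\varphi_n]$.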

The following query naturally arises: is the limiting value of
$f[\varphi_n]$ for two distinct increasing sequences $\varphi_n$ that
converge to the same function $\varphi$ the same?

We confine ourselves here to the following simple assertion:
starting from given sequences $\varphi_n \nearrow \varphi$,
$\psi_n \nearrow \varphi$ we construct the strictly increasing
sequences $\tilde\varphi_n = \varphi_n - 1/n$, $\tilde\psi_n = \psi_n - 1/n$;
if for any $n$ there exist $k$, $m$ such that $\tilde\psi_m > \tilde\varphi_n$,
$\tilde\varphi_k > \tilde\psi_n$, then $\lim f[\varphi_n] = \lim f[\psi_n]$.
For in this case we can construct the new sequence

$$\tilde\varphi_1 < \tilde\psi_{m_1} < \tilde\varphi_{n_1} < \tilde\psi_{m_2} < \cdots,$$

which also converges to $\varphi$; by what we have proved the values of
the functional $f$ on this sequence have a limit; but this means that
the numbers $f[\varphi_n]$, $f[\psi_n]$ tend to one and the same limit.

The same thing applies of course to decreasing sequences. 

Such a situation always occurs, for example, in the case where $K$
is the space $C$ of all continuous functions on a closed interval or
on a closed bounded set in $p$-dimensional space. In fact let us fix $n$
and let $m$ increase without bound. We shall suppose that for each
$m$ the set $E_m$ on which $\psi_m - 1/m \le \varphi_n - 1/n$ is non-empty.
The closed sets $E_m$ form a decreasing sequence
($E_1 \supset E_2 \supset \cdots$) and if they are all non-empty, they
will contain a common point $x_0$. Proceeding to the limit as
$m \to \infty$ in the inequality

$$\psi_m(x_0) - \frac{1}{m} \le \varphi_n(x_0) - \frac{1}{n}$$

we get

$$\varphi(x_0) \le \varphi_n(x_0) - \frac{1}{n}$$

or

$$\varphi_n(x_0) \ge \varphi(x_0) + \frac{1}{n},$$

which is impossible.
Thus in the case of the space $C$ the functional $f$ can be uniquely
defined on the class $C^+$ of bounded functions which are limits of
increasing sequences of continuous functions. The equation
$f[\varphi + \psi] = f[\varphi] + f[\psi]$ obviously remains valid on the
class $C^+$. Let us form the class $R$ of differences
$g = \varphi - \psi$, $\varphi \in C^+$, $\psi \in C^+$, and put
$f[g] = f[\varphi] - f[\psi]$. This definition leads to a unique result
(cf. Chapter IV, Section 2). The functional $f$ remains additive and
homogeneous over the whole class $R$. If the class $K$ contains
$\max\{\varphi, \psi\}$, $\min\{\varphi, \psi\}$ together with the
functions $\varphi$, $\psi$, inequality (2) continues to hold on the
class $R$; more precisely, $g \in R$ implies

$$|f[g]| \le C \sup |g(x)|. \qquad (3)$$

For let

$$g = \varphi - \psi, \quad \varphi_n \nearrow \varphi, \quad \psi_n \nearrow \psi \quad (\varphi_n, \psi_n \in K).$$

We put $\sup |\varphi(x) - \psi(x)| = \mu$. If
$\sup |\varphi_n(x) - \psi_n(x)| \le \mu$ for all $n$, the required
inequality (3) is obtained by proceeding to the limit in the inequality

$$|f[\varphi_n - \psi_n]| \le C \sup |\varphi_n(x) - \psi_n(x)|,$$

which holds since $\varphi_n - \psi_n \in K$. In the general case we
replace the function $\varphi_n$ by $\tilde\varphi_n$ by means of the
formula

$$\tilde\varphi_n = \max [\psi_n - \mu, \min (\varphi_n, \psi_n + \mu)],$$

which "truncates" $\varphi_n$ at the limits $\psi_n \pm \mu$. Together
with $\varphi_n$, $\psi_n$, the functions $\tilde\varphi_n$ are also
increasing and tend to $\varphi$. But now
$|\tilde\varphi_n - \psi_n| \le \mu$ and our assertion is therefore correct.

II. As the space $K$ let us take the space $C(a, b)$ of all continuous
functions $\varphi(x)$ on the closed interval $[a, b]$. According to I
the given continuous linear functional $f[\varphi]$ has a unique
extension to some class of discontinuous functions; without describing
this class completely, we observe only that it contains the
characteristic functions of all intervals on the line. We define the
function

$$F(\xi) = f[\chi_{[a,\xi]}(x)],$$

where $\chi_{[a,\xi]}(x)$ is equal to 1 for $a \le x \le \xi$ and 0 for
$\xi < x \le b$. We shall now show that $F(\xi)$ is of bounded
variation. Let $a = \xi_0 < \xi_1 < \cdots < \xi_n = b$ be some
partition of $[a, b]$; we obtain a bound for the sum

$$\sum_{j=0}^{n-1} |F(\xi_{j+1}) - F(\xi_j)|.$$

Evidently

$$|F(\xi_{j+1}) - F(\xi_j)| = \pm [F(\xi_{j+1}) - F(\xi_j)] = \pm f[\chi_{(\xi_j, \xi_{j+1}]}(x)]
= f[\pm \chi_{(\xi_j, \xi_{j+1}]}(x)]$$

and

$$\sum_{j=0}^{n-1} |F(\xi_{j+1}) - F(\xi_j)| = f\Bigl[\sum_{j=0}^{n-1} \pm \chi_{(\xi_j, \xi_{j+1}]}(x)\Bigr].$$

The absolute value of the function in square brackets is at most 1;
hence in virtue of inequality (3)

$$\sum_{j=0}^{n-1} |F(\xi_{j+1}) - F(\xi_j)| \le C,$$

and it follows that $F(x)$ is of bounded variation.
We shall show further that $F(x)$ is continuous to the right, i.e.
for any $\xi < b$ and any sequence $\xi_n \searrow \xi$

$$F(\xi) = \lim F(\xi_n). \qquad (4)$$

[Fig. 13]

The sequence $\chi_n(x) = \chi_{[a,\xi_n]}(x)$ is decreasing and tends to
the function $\chi(x) = \chi_{[a,\xi]}(x)$. The numbers $f[\chi_n]$,
$f[\chi]$ are determined uniquely; in particular, $f[\chi]$ is the limit
of the numbers $f[\psi_n]$, where $\psi_n(x)$ is a continuous function,
equal, say, to 1 for $x \le \xi + 1/n$ and to 0 for $x \ge \xi + 2/n$
and linear over the interval $[\xi + 1/n, \xi + 2/n]$.

To establish the required relation $f[\chi_n] \to f[\chi]$, it is
sufficient to show that for any $n$ we can find $k$, $m$ such that

$$\psi_k + \frac{1}{k} < \chi_n + \frac{1}{n}, \qquad \chi_m + \frac{1}{m} < \psi_n + \frac{1}{n}.$$

But by an elementary geometrical construction (Fig. 13) it is easily
seen that for a given $n$ the required $k$, $m$ always exist. The
relation (4) therefore holds.

We observe in particular that if the functional $f[\varphi]$ is
non-negative, i.e. takes values $\ge 0$ for functions $\varphi(x) \ge 0$,
then this property is preserved under the extension of the functional
to the specified class of discontinuous functions. Since we have
$\chi_{[a,\xi]}(x) \le \chi_{[a,\eta]}(x)$ for $\xi < \eta$, it follows
that in this case $F(\xi) \le F(\eta)$, i.e. the function $F(\xi)$ is
non-decreasing.

The function $F(x)$, as a function of bounded variation, can act
as integrating function for the Stieltjes integral. We then have

$$f[\chi_{[a,\xi]}(x)] = F(\xi) = \int_a^\xi dF(x) = \int_a^b \chi_{[a,\xi]}(x)\, dF(x), \qquad (5)$$

which holds not only for the characteristic functions of intervals,
but for all step functions; and since every continuous function
is the limit of a uniformly convergent sequence of step functions,
we get on proceeding to the limit

$$f[\varphi] = \int_a^b \varphi(x)\, dF(x),$$

valid now for any continuous function $\varphi(x)$. The theorem is
proved.
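
The representation can be illustrated numerically. The following is only a sketch under stated assumptions: the functional $f[\varphi] = \int_0^1 \varphi(x)\,dx + 2\varphi(1/2)$ on $C(0, 1)$ and all names in the code (`stieltjes_sum`, `F`, `functional`) are hypothetical examples chosen for illustration, not taken from the text. Its integrating function is $F(\xi) = \xi$ plus a jump of size 2 at $\xi = 1/2$, which is of bounded variation and continuous to the right.

```python
def stieltjes_sum(phi, F, a, b, n=20000):
    """Riemann-Stieltjes sum of phi against dF on [a, b], using right endpoints."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        left, right = a + j * h, a + (j + 1) * h
        total += phi(right) * (F(right) - F(left))
    return total


def F(xi):
    # Hypothetical integrating function: linear part plus a jump of 2 at 1/2.
    # The jump is included at xi = 1/2 itself, so F is continuous to the right.
    return xi + (2.0 if xi >= 0.5 else 0.0)


def functional(phi, n=20000):
    # The functional itself: midpoint rule for the integral, plus 2 * phi(1/2).
    h = 1.0 / n
    integral = sum(phi((j + 0.5) * h) for j in range(n)) * h
    return integral + 2.0 * phi(0.5)


phi = lambda x: x * x
print(functional(phi), stieltjes_sum(phi, F, 0.0, 1.0))
```

The two printed numbers agree up to the discretisation error of the sums, as formula (1) of the theorem predicts for this particular functional.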

III. Let us see how we can generalise this theorem to the case 
of functions of several independent variables. 

In accordance with art. I a functional $f[\varphi]$ on the space $C$ of
continuous functions (for simplicity, of two variables $x$, $y$) on the
square $a \le x \le b$, $a \le y \le b$† can be continued over a class
of discontinuous functions containing the characteristic functions of
all rectangles. We define the function of two variables

$$F(\xi, \eta) = f[\chi_{[a,a;\,\xi,\eta]}(x, y)],$$

where $\chi_{[a,a;\,\xi,\eta]}(x, y)$ is the characteristic function of
the rectangle $\{a \le x \le \xi,\ a \le y \le \eta\} = D_{\xi\eta}$. By
the same procedure as in II it can be shown that the function
$F(\xi, \eta)$ is of bounded variation and is continuous above (cf. the
end of Section 6) and that consequently it is an integrating function
for some Stieltjes integral.


† The general case where the functions $\varphi(x, y)$ vary over an
arbitrary bounded closed set $F$ also reduces to this case. The set $F$
can be enclosed in a square $Q$ of the given form and every continuous
function can be continued, without increasing the upper bound of its
modulus, into a continuous function on $Q$. Conversely, since every
continuous function on $Q$ is also continuous on $F$, we can regard the
functional $f$ as defined on the whole set $C(Q)$.


Further, since

$$f[\chi_{[a,a;\,\xi,\eta]}(x, y)] = F(\xi, \eta) = \int_a^\xi \int_a^\eta dF(x, y)
= \int_a^b \int_a^b \chi_{[a,a;\,\xi,\eta]}(x, y)\, dF(x, y),$$

we get by taking linear combinations and by completion with
respect to the metric of $C$ that, for any continuous function
$\varphi(x, y)$,

$$f[\varphi] = \int_a^b \int_a^b \varphi(x, y)\, dF(x, y).$$


Similar reasoning can be applied to the case of any number of 
variables. 


2. Absolutely Monotonic Functions. 


An infinitely differentiable function $f(x)$, defined on the closed
interval $a \le x \le b$ ($-\infty \le a$, $b \le \infty$), is said to
be absolutely monotonic if it is non-negative together with all its
derivatives:

$$f^{(n)}(x) \ge 0 \quad (n = 0, 1, 2, \ldots).$$

Examples of absolutely monotonic functions are positive constants
and functions of the form $e^{\alpha x}$ ($\alpha > 0$). It is found
that if the interval $[a, b]$ is infinite, every absolutely monotonic
function is a "Stieltjes combination" of the simple absolutely
monotonic functions $e^{\alpha x}$. For definiteness we confine
ourselves to the half-line $-\infty < x \le 0$.


THEOREM 2. (S. N. Bernstein). Every function $f(x)$ that is
absolutely monotonic for $x \le 0$ can be expressed in the form

$$f(x) = C + \int_0^\infty e^{\alpha x}\, dF(\alpha), \qquad (1)$$

where $F(\alpha)$ is some bounded non-decreasing function.

Alternatively the constant $C$ can be included in the integral if
the function $F(\alpha)$ is given an additional saltus at $\alpha = 0$.

Before we prove the theorem†, we give an account of some of
the properties of absolutely monotonic functions. Since
$f^{(n)}(x) \ge 0$ and is non-decreasing, the limit
$b_n = \lim_{x \to -\infty} f^{(n)}(x)$ exists; evidently $b_0 \ge 0$,
$b_1 = b_2 = \cdots = 0$. We claim further that as $x \to -\infty$ the
functions $f^{(n)}(x)$ tend to zero so rapidly that all the integrals

$$I_n = \int_{-\infty}^0 |x|^n f^{(n+1)}(x)\, dx$$

converge, and that the value of the integral $I_n$ is

$$I_n = M\, n!,$$

where $M = f(0) - f(-\infty)$. For, for any $n \ge 1$ and $x < 0$, we
have

$$f^{(n)}(x) \le \frac{2}{|x|} \int_x^{x/2} f^{(n)}(\xi)\, d\xi
= \frac{2}{|x|} \Bigl[ f^{(n-1)}\Bigl(\frac{x}{2}\Bigr) - f^{(n-1)}(x) \Bigr],$$

so that as $x \to -\infty$ each derivative decreases more rapidly than
its predecessor, the difference being at least of the first order (in
powers of $x$); since $f(x/2) - f(x) \to 0$, we have
$f^{(n)}(x)\, x^n \to 0$ for any $n$. Hence, integrating by parts,

$$I_n = \int_{-\infty}^0 |x|^n f^{(n+1)}(x)\, dx
= \Bigl[ |x|^n f^{(n)}(x) \Bigr]_{-\infty}^0 + n \int_{-\infty}^0 |x|^{n-1} f^{(n)}(x)\, dx = n\, I_{n-1},$$

and the extra-integral term vanishes. Successive integrations give
us finally

$$\frac{I_n}{n!} = \frac{I_{n-1}}{(n-1)!} = \cdots = I_0 = \int_{-\infty}^0 f'(x)\, dx = f(0) - f(-\infty) = M,$$

as required.

† After B. I. Korenblum, Advances in Mathematical Sciences, 1951, Vol. 6,
No. 4.
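
The identity $I_n = M\,n!$ can be checked numerically on a simple absolutely monotonic function. The sketch below is illustrative only: the test function $f(x) = 3 + 2e^{x} + 0.5\,e^{3x}$, the truncation of the half-line at $x = -60$, and the helper names (`derivative`, `simpson`, `I`) are assumptions made for this example and do not come from the text.

```python
import math

C = 3.0                              # the constant term, equal to f(-infinity)
COEFFS = [(2.0, 1.0), (0.5, 3.0)]    # pairs (c_k, alpha_k) with alpha_k > 0


def derivative(order, x):
    """order-th derivative of f(x) = C + sum of c_k * exp(alpha_k * x)."""
    value = C if order == 0 else 0.0
    return value + sum(c * a ** order * math.exp(a * x) for c, a in COEFFS)


def simpson(g, lo, hi, panels=60000):
    """Composite Simpson rule with an even number of panels."""
    h = (hi - lo) / panels
    s = g(lo) + g(hi)
    for j in range(1, panels):
        s += (4 if j % 2 else 2) * g(lo + j * h)
    return s * h / 3


def I(n, cutoff=-60.0):
    # Integral of |x|^n f^(n+1)(x) over [cutoff, 0]; the tail beyond the
    # cutoff is negligible for this exponentially decaying integrand.
    return simpson(lambda x: abs(x) ** n * derivative(n + 1, x), cutoff, 0.0)


M = derivative(0, 0.0) - C           # f(0) - f(-infinity)
for n in range(5):
    print(n, I(n), M * math.factorial(n))
```

For each $n$ the two printed values coincide to within the quadrature error, in agreement with the formula just proved.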
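We now turn to the proof of theorem 2. By Dirichlet's formula

$$f(x) - f(-\infty) = \int_{-\infty}^x f'(\xi)\, d\xi = \int_{-\infty}^x (x - \xi) f''(\xi)\, d\xi = \cdots
= \frac{1}{n!} \int_{-\infty}^x (x - \xi)^n f^{(n+1)}(\xi)\, d\xi.$$

The substitution $\xi = -nt$ gives

$$f(x) - f(-\infty) = \frac{1}{n!} \int_{-x/n}^\infty \Bigl(1 + \frac{x}{nt}\Bigr)^n (nt)^n f^{(n+1)}(-nt)\, n\, dt
= \int_{-x/n}^\infty \Bigl(1 + \frac{x}{nt}\Bigr)^n dF_n(t),$$

where

$$F_n(t) = \frac{1}{n!} \int_0^t (n\tau)^n f^{(n+1)}(-n\tau)\, n\, d\tau. \qquad (2)$$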
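We claim that the functions $F_n(t)$ are uniformly bounded for all
$t \ge 0$. For substituting $\eta$ for $-n\tau$ in (2) we get

$$0 \le F_n(t) = \frac{1}{n!} \int_{-nt}^0 f^{(n+1)}(\eta)\, |\eta|^n\, d\eta
\le \frac{1}{n!} \int_{-\infty}^0 f^{(n+1)}(\eta)\, |\eta|^n\, d\eta = M.$$

We define the functions

$$\varphi_n(t, x) = \begin{cases} \Bigl(1 + \dfrac{x}{nt}\Bigr)^n, & t \ge -\dfrac{x}{n}, \\[2mm] 0, & 0 \le t \le -\dfrac{x}{n}. \end{cases}$$

As $n \to \infty$ these functions tend uniformly in $t$ ($t \ge 0$) to
the limit

$$\varphi(t, x) = e^{x/t}.$$

We observe that as $t \to \infty$ the exponential increases to the
value 1 in virtue of the hypothesis $x < 0$.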
By Helly's second theorem (Section 6) we can extract from the
sequence of non-decreasing functions $F_n(t)$ a subsequence which
converges everywhere; by the first theorem, on the convergence
of Stieltjes integrals (cf. notes 1 and 2 following it), we have

$$\int_0^\infty \varphi_n(t, x)\, dF_n(t) \to \int_0^\infty \varphi(t, x)\, dF(t).$$

Hence for $x < 0$

$$f(x) = f(-\infty) + \int_0^\infty e^{x/t}\, dF(t);$$

substituting $\alpha$ for $1/t$, we get the required formula (1).


3. The Map of the Unit Disc into the Right Half-plane 


We shall find here the general form of a function $w = f(z)$,
analytic on the disc $|z| < 1$ and having a non-negative real part,
i.e. a function which maps the disc $|z| < 1$ into the right half-plane.
Examples of such functions are the constant $\alpha + i\beta$,
$\alpha \ge 0$, and the function

$$f_t(z) = \frac{e^{it} + z}{e^{it} - z}$$

[Fig. 14]

with arbitrary real $t$. For, for a given $t$ and $|z| < 1$, the points
$z_1 = e^{it} + z$, $z_2 = e^{it} - z$ lie in the closed disc $Q$ of unit
radius, centre $e^{it}$, on a diameter (Fig. 14). The whole diameter
subtends an angle $\pi/2$ at the origin (which lies on the boundary of
$Q$); the segment contained between $z_1$ and $z_2$ therefore subtends
an angle $\alpha < \pi/2$. Hence
$|\arg f_t(z)| = |\arg z_1 - \arg z_2| < \pi/2$ and it follows
that $\operatorname{Re} f_t(z) \ge 0$.
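
A quick numerical check of this example may be helpful. The sketch below is an illustration only: it evaluates the simple function above on a grid of points of the disc and compares its real part with the closed expression $(1 - |z|^2)/|e^{it} - z|^2$, a standard identity that is not stated in the text; the names `kernel`, `poisson` and `combination` are hypothetical. The last function forms a finite "Stieltjes combination" with non-negative weights, corresponding to a non-decreasing step function $F(t)$.

```python
import cmath
import math


def kernel(t, z):
    """The simple function (e^{it} + z) / (e^{it} - z)."""
    e = cmath.exp(1j * t)
    return (e + z) / (e - z)


def poisson(t, z):
    """Closed form of the real part: (1 - |z|^2) / |e^{it} - z|^2."""
    return (1.0 - abs(z) ** 2) / abs(cmath.exp(1j * t) - z) ** 2


def combination(z, masses):
    """Finite sum of m_k * kernel(t_k, z) with weights m_k >= 0 (a step function F)."""
    return sum(m * kernel(t, z) for t, m in masses)


z = 0.9 * cmath.exp(0.7j)
print(kernel(1.0, z).real, poisson(1.0, z))
print(combination(z, [(0.2, 1.5), (2.0, 0.25), (4.0, 3.0)]).real)
```

In every case the real part is non-negative, as the geometric argument shows it must be.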

It is found that every function w = f(z) that is analytic on the 
disc |z| < 1 and maps it into the right half-plane is a “Stieltjes 
combination”’ of the given simple functions; we have the following 
theorem: 


THEOREM 3. (G. Herglotz). Every analytic function on the disc
$|z| < 1$ with a non-negative real part can be expressed in the form

$$f(z) = i\beta + \int_0^{2\pi} \frac{e^{it} + z}{e^{it} - z}\, dF(t), \qquad (1)$$

where $\beta$ is a real number and $F(t)$ a non-decreasing function.

Proof.† The analytic function $f(z)$ is well known to be expressible
in the disc $|z| < r < 1$ in terms of the boundary values of its
real part $u(z)$ by Schwarz' formula

$$f(z) = \frac{1}{2\pi} \int_0^{2\pi} \frac{r e^{it} + z}{r e^{it} - z}\, u(r e^{it})\, dt + i\beta.$$

This integral can be written in the form

$$f(z) = \int_0^{2\pi} \frac{r e^{it} + z}{r e^{it} - z}\, dF_r(t) + i\beta,$$

where

$$F_r(t) = \frac{1}{2\pi} \int_0^t u(r e^{i\tau})\, d\tau$$

is a non-decreasing function of $t$. By the mean-value theorem for
harmonic functions we have further

$$F_r(t) \le F_r(2\pi) = \frac{1}{2\pi} \int_0^{2\pi} u(r e^{it})\, dt = u(0),$$


so that the family of functions $F_r(t)$ is uniformly bounded for all
$r < 1$. As $r \to 1$ the functions $(r e^{it} + z)/(r e^{it} - z)$
($|z| < 1$ fixed) converge uniformly in $t$ to the function
$(e^{it} + z)/(e^{it} - z)$. By theorem 2 of Section 6 we can extract
from the sequence $F_r(t)$ ($r \to 1$) a subsequence which converges
everywhere to some non-decreasing function $F(t)$; applying theorem 1
of Section 6, we get

$$f(z) = \int_0^{2\pi} \frac{r e^{it} + z}{r e^{it} - z}\, dF_r(t) + i\beta
\to \int_0^{2\pi} \frac{e^{it} + z}{e^{it} - z}\, dF(t) + i\beta,$$

as asserted.

† After N. I. Akhiezer and I. M. Glazman (Theory of Linear Operators in
Hilbert Space, Ungar, New York, 1961). For Schwarz' formula, see e.g.
A. I. Markushevich, The theory of analytic functions, Hindustan Pub.,
Delhi, 1963, ch. 6.

Note. A constant $\alpha > 0$ can also be represented by formula (1);
since the integral of $(e^{it} + z)/(e^{it} - z)$ over $0 \le t \le 2\pi$
equals $2\pi$, it is sufficient for this to put $F(t) = \alpha t / 2\pi$.


8. DIFFERENTIATION OF FUNCTIONS OF SETS


1. The most general formulation of theorems on the differentia- 
tion of set functions is achieved by abstracting from the properties 
of the primary set with respect to which differentiation is carried 
out. 

Let there be given some abstract set $X$ and some family $L$ of
real functions $f(x)$ defined on $X$. The family $L$ is assumed to be
a linear space under the usual operations of addition and scalar
multiplication, containing every constant and together with each
function $f(x)$ its modulus $|f(x)|$. It follows that $f \in L$ implies
$f^+ \in L$, $f^- \in L$, and $f \in L$, $g \in L$ implies
$\max (f, g) \in L$, $\min (f, g) \in L$.

Further let an "integral" $I$ be given on $L$, in other words, a
linear functional with the properties (a)-(g) listed below:

(a) $I\varphi \ge 0$ whenever $\varphi(x) \ge 0$.

This implies that $I\varphi \le I\psi$ for $\varphi \le \psi$ and that
$|I\varphi| \le I(|\varphi|)$.

(b) If the monotone increasing sequence $\varphi_n$ converges to the
function $\varphi$ and the $I\varphi_n$ are bounded, then
$\varphi \in L$ and $I\varphi = \lim I\varphi_n$.

A function $\varphi(x)$ for which $I(|\varphi|) = 0$ is said to be
$I$-equivalent to zero and a set on which it is non-zero is said to be
a set of $I$-measure zero.

(c) Any function $\varphi(x)$ that is non-zero only on a set of
$I$-measure zero belongs to the space $L$, and $I\varphi = 0$.

(d) The space $L_I$ of classes of $I$-equivalent functions is a complete
normed space with the norm

$$\|\varphi\| = I(|\varphi|).$$

From (d) it can be inferred that the set of $\varphi \in L$ for which
$|\varphi|^2 \in L$ constitutes a complete Hilbert space with the scalar
product

$$(\varphi, \psi) = I(\varphi\psi).$$


(e) There exists a set $L_0$ of bounded functions which is dense in
the space $L_I$.

The set of all functions $\varphi \in L_I$ that are limits of increasing
sequences of functions $\varphi_n \in L_0$ is denoted by $L^+$.

(f) Every function $\varphi \in L_I$ is the difference of two functions
belonging to the class $L^+$.

Since we have sets of measure zero, it is natural to define the
concept "almost everywhere". For instance, a sequence of functions
converges almost everywhere if it converges at all points
of $X$ except perhaps on a set of measure zero. We shall call the
limit of a sequence of functions $\varphi_n(x) \in L_I$ which converges
almost everywhere an $I$-measurable function.

(g) The product of two $I$-measurable functions is an $I$-measurable
function; the quotient $\varphi/\psi$ is an $I$-measurable function if
the denominator $\psi$ is $I$-measurable and vanishes at most on a set
of $I$-measure zero.

Just as in Chapter IV it can be shown that an $I$-measurable
function bounded in absolute value by an $I$-summable function
(i.e. by a function $\psi \in L_I$) is itself $I$-summable. In
particular every bounded $I$-measurable function is $I$-summable.

Examples of the class $L_I$ are afforded by the spaces of functions
which are integrable with respect to the measures of Lebesgue or
Stieltjes over a closed interval on the line or a region in
$n$-dimensional space.

2. Our problem is to compare two integrals $I$, $J$ which satisfy
the conditions postulated. It is assumed that the aggregate $L_I$
of functions $\varphi$ integrable in the sense of the integral $I$
(briefly $I$-integrable) and the aggregate $L_J$ of functions $\psi$
integrable in the sense of the functional $J$ ($J$-integrable) are
defined on one and the same set $X$ and have an intersection $L_0$
(functions which are both $I$- and $J$-integrable) dense in $L_I$ with
respect to the metric of $L_I$ and in $L_J$ with respect to the metric
of $L_J$.

We shall say that the integral $I$ is absolutely continuous relative
to the integral $J$ if for any $\psi \in L_0$ we have $I\psi = 0$
whenever $\psi \ge 0$, $J\psi = 0$.

For example, let $I$ be the Stieltjes integral over the closed interval
$[a, b]$ with an absolutely continuous non-negative integrating
function $F(x)$, and let $J$ be the Lebesgue integral over the same
interval. As the set $L_0$ we take the set of all bounded measurable
functions. If $\psi \in L_0$, $\psi \ge 0$, $J\psi = 0$, then, as we
know, the function $\psi$ vanishes almost everywhere; but then also

$$I\psi = \int_a^b \psi\, dF = \int_a^b \psi\, F'(x)\, dx = 0,$$

i.e. the integral $I$ is absolutely continuous relative to $J$ in the
sense just explained. In this case, as we saw in Section 5, the
$I$-integrable functions $\varphi$ are characterised by the property
that the product of a function $\varphi$ by some fixed $J$-integrable
function $g$ ($= F'(x)$) is again a $J$-integrable function.

An analogous result holds in the general case; we shall prove 
the following fundamental theorem: 


THEOREM (Radon-Nikodym). A necessary and sufficient condition for the
integral $I$ to be absolutely continuous relative to the integral $J$
is that there exist a $J$-integrable function $\psi_0$, such that its
product with any $I$-integrable function $\varphi$ is again a
$J$-integrable function and

$$I\varphi = J(\psi_0 \varphi).$$

Proof†. The sufficiency of the condition is obvious: if
$\psi \in L_0$, $\psi \ge 0$, $J\psi = 0$, then the set on which
$\psi > 0$ is of $J$-measure zero; by axiom (c) the function
$\psi\psi_0 \in L_J$, $J(\psi\psi_0) = 0$, and consequently
$I\psi = J(\psi\psi_0) = 0$. We must show that the condition is
necessary for the absolute continuity of $I$ relative to $J$.

Let us consider first the case when $I \le J$, i.e.
$I\varphi \le J\varphi$ for any $\varphi(x) \ge 0$, $\varphi \in L_0$.
In this case every set of $J$-measure zero will be of $I$-measure zero.
Every $J$-measurable function, as the limit of a $J$-almost everywhere
convergent sequence $\varphi_n \in L_0$, will also be $I$-measurable.

We claim further that in this case the integral $I$ can be defined
over all functions $\varphi \in L_J$.

For we can form a sequence $\varphi_n \in L_0$ which converges to a
given function $\varphi \in L_J$ in the metric of $L_J$, such that

$$J(|\varphi_n - \varphi|) \to 0.$$

But then

$$|I\varphi_n - I\varphi_m| \le I(|\varphi_n - \varphi_m|) \le J(|\varphi_n - \varphi_m|) \to 0,$$

so that the sequence $\varphi_n$ is fundamental with respect to the
norm of $L_I$. We put $I\varphi = \lim I\varphi_n$. It is easily seen
that this definition is unique and that for $\varphi \ge 0$ it is
always the case that $I\varphi \le J\varphi$.


+ After F. Riesz. 


We claim moreover that the functional $I$, extended as indicated
over the whole space $L_J$, is a bounded functional on any space
$L_J^p$ ($p \ge 1$) with respect to the norm of that space. For we have
for any $\varphi \in L_J^p$

$$I(|\varphi|^p) \le J(|\varphi|^p),$$

so that the functional $I$ is bounded on the unit sphere of the space
$L_J^p$. We shall consider only the value $p = 2$. In accordance with
the theorem on the general form of a bounded linear functional in
Hilbert space (Chapter V, Section 2) there exists a function
$g \in L_J^2$, such that for any $\varphi \in L_J^2$

$$I\varphi = J(g\varphi). \qquad (1)$$


We claim that the function $g$ is included between the bounds
$0 \le g(x) \le 1$ almost everywhere in the sense of $J$-measure. For,
putting $\varphi = g^-$ in (1), we get

$$Ig^- = J(g g^-) = J(-(g^-)^2) = -J((g^-)^2),$$

and since $Ig^- \ge 0$, $J((g^-)^2) \ge 0$, we have

$$Ig^- = J((g^-)^2) = 0,$$

so that the set on which $g^-(x) \ne 0$ is of $J$-measure zero. Thus we
have $g(x) \ge 0$ almost everywhere in the sense of $J$-measure. The
second inequality $g(x) \le 1$ is obtained by substituting $J - I$ for
the functional $I$ in the foregoing argument.

We have established equation (1) for all $\varphi \in L_J^2$. We shall
now carry it over to all functions $\varphi \in L_I$. It is sufficient
to consider functions of the class $L_I^+$. By hypothesis every
function $\varphi \in L_I^+$ can be expressed as the limit of an
increasing sequence of functions $\varphi_n \in L_0$. For the functions
$\varphi_n$ equation (1) holds:

$$I\varphi_n = J(g\varphi_n).$$

The relation $\varphi_n \nearrow \varphi$ implies that
$g\varphi_n \nearrow g\varphi$ and in virtue of property (b) the
function $g\varphi$ belongs to $L_J$ so that

$$J(g\varphi) = \lim J(g\varphi_n) = \lim I\varphi_n = I\varphi.$$

Thus for any $\varphi \in L_I$ we have $g\varphi \in L_J$ and
$J(g\varphi) = I\varphi$. The converse result also holds in the
following form: if $\varphi$ is $J$-measurable and $g\varphi \in L_J$,
then $\varphi \in L_I$ and $I\varphi = J(g\varphi)$†. For the proof we
put $\varphi_N = \min \{|\varphi|, N\}$. The function $\varphi_N$ is
bounded and $J$-measurable and therefore $I$-summable, and by what we
have proved $I\varphi_N = J(g\varphi_N)$. Since
$J(g\varphi_N) \le J(g|\varphi|)$, the numbers $I\varphi_N$ are
bounded; it follows that $|\varphi| = \lim \varphi_N$ belongs to $L_I$,
as required.

† If we include axioms connecting measurable functions with measurable
sets, as in the Lebesgue case, the requirement for $\varphi$ to be
$J$-measurable becomes superfluous (cf. Section 5, art. 3).

We have proved the Radon-Nikodym theorem in the case $I \le J$.
We proceed now to the general case.

Let $I$, $J$ be arbitrary non-negative functionals satisfying the
conditions of art. 1 with $I$ absolutely continuous relative to $J$. We
define the functional $K = I + J$. Since $I \le K$, $J \le K$, there
exist functions $k$, $l$ belonging to the space $L_K^2$ and included
between the bounds 0, 1, such that for any $\varphi \in L_I$

$$I\varphi = K(k\varphi) \qquad (2)$$

and for any $\psi \in L_J$

$$J\psi = K(l\psi). \qquad (3)$$

For $\varphi = \psi \in L_0$ we get

$$I\varphi + J\varphi = K\varphi = K[(k + l)\varphi],$$

and it follows, since $L_0$ is dense in $L_K$, that for any
$\varphi \in L_K$

$$K\varphi = K[(k + l)\varphi].$$

We claim that almost everywhere in the sense of $K$-measure

$$k + l = 1. \qquad (4)$$

To see this we take as $\varphi$ the characteristic function $e_0$ of
the set $E_0 = \{k + l < 1\}$. We get

$$Ke_0 = K[(k + l)\, e_0]$$

or

$$K\{[1 - (k + l)]\, e_0\} = 0. \qquad (5)$$

But the function $[1 - (k + l)]\, e_0$ is non-negative; in virtue of (5)
it vanishes almost everywhere in the sense of $K$-measure. Hence the
inequality $k + l < 1$ can hold at most on a set of $K$-measure
zero, and similarly for the inequality $k + l > 1$. Thus (4) holds
almost everywhere in the sense of $K$-measure, as asserted.

It follows that for any $J$-summable function $\psi$ the product
$(1 - k)\psi$ is $K$-summable and we have

$$J\psi = K[(1 - k)\psi]. \qquad (6)$$

Equation (6) holds whenever $\psi$ is $J$-measurable and $(1 - k)\psi$
is $K$-summable; as we saw above, it follows in this case that $\psi$ is
$J$-summable. Taking as $\varphi$ in (2) and as $\psi$ in (3) the
characteristic function $e$ of the set $Z$ on which $k(x) = 1$, we get

$$Ie = K(ke) = K(e),$$
$$Je = K((1 - k)e) = K(0) = 0.$$

We see that the set $Z$ has $J$-measure zero. Since $I$ is absolutely
continuous relative to $J$, we have $Ie = 0$ and $Ke = 0$; thus the
set $Z$ has zero $I$-measure and $K$-measure. Now let $\varphi$ be any
function in the space $L_I$. For the function $\psi$ given by the
condition

$$k\varphi = (1 - k)\psi$$

we have

$$\psi = \frac{k\varphi}{1 - k} = k\varphi \cdot \frac{1}{1 - k}.$$

The function $k\varphi \in L_K$ here is $K$-measurable. The coefficient
$1/(1 - k)$ is also a $K$-measurable function since the domain on
which the denominator vanishes has $K$-measure zero. Hence the
function $\psi$ is $K$-measurable and therefore $J$-measurable. It
follows, since $(1 - k)\psi = k\varphi \in L_K$, that $\psi$ is
$J$-summable and

$$I\varphi = K(k\varphi) = K((1 - k)\psi) = J\psi = J\Bigl(\frac{k}{1 - k}\, \varphi\Bigr). \qquad (7)$$

Putting $\varphi = 1$ in (7), we find that the function
$\psi_0 = k/(1 - k)$ is $J$-integrable, which completes the proof of the
Radon-Nikodym theorem.
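
The construction of the proof can be followed step by step on a finite set, where every integral is a weighted sum. The sketch below is a toy illustration under stated assumptions: the four-point set, the weights `p` and `q` defining $I$ and $J$, and all names in the code are hypothetical and chosen only for this example. Exact rational arithmetic is used so that the identities hold without rounding.

```python
from fractions import Fraction as Fr

# Hypothetical weights on a four-point set X: I(phi) = sum p_i phi_i and
# J(phi) = sum q_i phi_i. Every q_i is positive, so I is absolutely
# continuous relative to J.
p = [Fr(1, 2), Fr(0), Fr(3, 2), Fr(2)]
q = [Fr(1), Fr(1, 3), Fr(1, 2), Fr(4)]


def integral(weights, phi):
    return sum(w * v for w, v in zip(weights, phi))


kw = [a + b for a, b in zip(p, q)]       # weights of K = I + J
k = [a / w for a, w in zip(p, kw)]       # density of I relative to K, equation (2)
l = [b / w for b, w in zip(q, kw)]       # density of J relative to K, equation (3)
psi0 = [ki / (1 - ki) for ki in k]       # the function k / (1 - k) of equation (7)

phi = [Fr(3), Fr(-1), Fr(4), Fr(2)]
lhs = integral(p, phi)                                     # I(phi)
rhs = integral(q, [d * v for d, v in zip(psi0, phi)])      # J(psi0 * phi)
print(lhs, rhs)
```

Here $k + l = 1$ at every point, and $\psi_0 = k/(1 - k)$ reduces to the ratio of the weights $p_i/q_i$, so that $I\varphi = J(\psi_0\varphi)$ exactly.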


Concluding Remark 


The connection between differentiation and integration described 
by us in Sections 1-3 was first discovered by Lebesgue (1902) and 
constitutes one of the most important achievements of the Lebes- 
gue theory of integration. Somewhat earlier (1894) T. Stieltjes 
(Dutch mathematician, 1856-1894), while engaged in a study of 
the theory of continued fractions, arrived at a new concept of the 
integral, now known as the Riemann-Stieltjes integral. With the 
work of F. Riesz, who in 1909 obtained with the aid of the Stieltjes 
integral the general representation of a linear functional on the 
space of continuous functions, the Stieltjes integral began to 
infiltrate widely into very diverse branches of analysis; the general 
theory of measure developed hand-in-hand, progressing gradually
from the line to many-dimensional space and then to abstract sets.
In just such general form measure theory and integral theory 
proved to be useful tools in problems of higher analysis such as 
the harmonic analysis of groups, the theory of random processes, 
dynamic systems, and others. Recommended literature: P. R. Hal- 
mos, Measure Theory, Van Nostrand, N. Y. (1950). 


CHAPTER VII 


THE FOURIER TRANSFORM 


1. ON THE CONVERGENCE OF FOURIER SERIES 


1. The development of functions in Fourier series is useful in
many analytical problems. In the simplest case, that of the closed
interval $-\pi \le x \le \pi$, this development is given in complex
terms by an expression of the form

$$\varphi(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx}. \qquad (1)$$

The development in Fourier series appears more frequently than
other possible developments for the following reasons. First, the
functions $e^{imx}$ are orthogonal for distinct $m$ [in the metric of
the complex Hilbert space $L_2(-\pi, \pi)$], so that (1) is a
development over an orthogonal basis. Secondly, the functions
$e^{imx} = u_m(x)$ remain well-behaved analytically (entire analytic
functions) when continued with period $2\pi$ over the whole axis, and
satisfy simple functional equations such as

$$u_m(x + \xi) = u_m(x)\, u_m(\xi)$$

or

$$u_m'(x) = i m\, u_m(x).$$

Thirdly, the coefficients of the development (1) can be calculated
from the simple formula

$$a_m = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(\xi)\, e^{-im\xi}\, d\xi. \qquad (2)$$


Various formulations are possible for the question as to whether
the Fourier series (1) converges. We can enquire first of all whether
it converges at a given point $x$. Then we can consider its
convergence in different norms. We shall narrow down the second
question in art. 2; here we concern ourselves with the first question,
viz. the convergence in the usual numerical sense of the Fourier
series at a particular point.

We prove the following theorem, which gives a sufficient condition
for the convergence of the Fourier series (1) to the value
$\varphi(x)$ at a given point $x$.


THEOREM 1. If $\varphi(x)$ is a summable function and, for some
$\delta > 0$, the integral

$$\int_{-\delta}^{\delta} \left| \frac{\varphi(x_0 + t) - \varphi(x_0)}{t} \right| dt$$

converges, then the partial sums of the Fourier series of $\varphi(x)$
converge at $x = x_0$ to the value $\varphi(x_0)$.

The condition that the ratio $[\varphi(x + t) - \varphi(x)]/t$ be
summable for $|t| \le \delta$ is known as Dini's condition. It is
satisfied, for example, if $\varphi(x)$ satisfies the Lipschitz
condition of order $\alpha$:

$$|\varphi(x + t) - \varphi(x)| \le C\, |t|^\alpha \quad (0 < \alpha \le 1).$$

In particular if $\varphi$ has a finite derivative at the point $x$ (or
even just finite derived numbers, Chapter VI, Section 1), the Lipschitz
condition of order 1 is satisfied and hence the numbers $s_n(x)$
converge to $\varphi(x)$.

Proceeding to the proof of the theorem, we begin by transforming
the expression for the partial sum $s_n(x)$ of the series (1). We have:

$$s_n(x) = \sum_{k=-n}^{n} a_k e^{ikx} = \frac{1}{2\pi} \sum_{k=-n}^{n} \int_{-\pi}^{\pi} \varphi(\xi)\, e^{ik(x - \xi)}\, d\xi.$$

We make the substitution $x - \xi = -t$. We shall suppose that
the function $\varphi(\xi)$ is continued from the interval
$[-\pi, \pi]$ to the whole line as a periodic function of period
$2\pi$; then the new limits of integration $-\pi - x$, $\pi - x$ can be
replaced by $-\pi$, $\pi$, giving (the symmetric sum being unchanged
when $k$ is replaced by $-k$)

$$\sum_{k=-n}^{n} a_k e^{ikx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(x + t) \sum_{k=-n}^{n} e^{ikt}\, dt.$$
on 1% a 


THE FOURIER TRANSFORM 361 


Summing the geometric progression, we get 


ent _ e-ifvt+ht eltst _ e-ifnt+gt 


n 
ikt — = 
Ze Tash 


ee Pay od 


e?—e 2 


__ sin(n + 3)t 


sin t/2 
and thus 
1 i sin(n + 1¢ 
8y (2) =scfowso ee — dat. (3) 
-2 ce 
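The function

\[ D_n(t) = \frac{1}{2\pi}\, \frac{\sin(n+\frac12)t}{\sin\frac{t}{2}} \]

is called Dirichlet's kernel. If we put φ(x) ≡ 1, then evidently
s_n(x) ≡ 1 for any n; in this case formula (3) gives

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\sin(n+\frac12)t}{\sin\frac{t}{2}}\, dt = 1. \]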
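The two facts just stated can be checked numerically. The following sketch is an illustration only and is not taken from the text: the function names and the use of the midpoint rule are assumptions made for the example. It compares the closed form of Dirichlet's kernel with the finite sum of exponentials and approximates its integral over [−π, π].

```python
import math


def dirichlet_kernel(n: int, t: float) -> float:
    """Closed form D_n(t) = sin((n + 1/2) t) / (2*pi*sin(t/2))."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        # Limiting value where sin(t/2) = 0 (t a multiple of 2*pi).
        return (2 * n + 1) / (2.0 * math.pi)
    return math.sin((n + 0.5) * t) / (2.0 * math.pi * s)


def dirichlet_sum(n: int, t: float) -> float:
    """(1/2pi) * sum_{k=-n}^{n} e^{ikt}; the sum is real, so cosines suffice."""
    return (1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, n + 1))) / (2.0 * math.pi)


def integrate_midpoint(f, a: float, b: float, steps: int = 4000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))


if __name__ == "__main__":
    n = 7
    # The two expressions for the kernel should agree.
    print(dirichlet_kernel(n, 0.7), dirichlet_sum(n, 0.7))
    # The integral over a full period should be close to 1.
    print(integrate_midpoint(lambda t: dirichlet_kernel(n, t), -math.pi, math.pi))
```

The midpoint rule is chosen here only because it is simple and very accurate for trigonometric polynomials over a full period; any other quadrature would serve.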


The difference s_n(x) − φ(x) can now be written in the form

\[ s_n(x) - \varphi(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} [\varphi(x+t) - \varphi(x)]\, \frac{\sin(n+\frac12)t}{\sin\frac{t}{2}}\, dt. \tag{4} \]

We want to ascertain under what conditions s_n(x) tends to φ(x)
or, what is the same thing, under what conditions the integral (4)
tends to zero. For this we prove a lemma:

Lemma 1. If φ(x) is a summable function on the closed interval
[a, b] then the integrals

\[ \int_a^b \varphi(x) \sin\lambda x\, dx, \qquad \int_a^b \varphi(x) \cos\lambda x\, dx \]

tend to zero as λ → ∞.


Proof. First let φ(x) be the characteristic function of an interval
(c, d) ⊂ [a, b]. Then

\[ \int_a^b \varphi(x) \sin\lambda x\, dx = \int_c^d \sin\lambda x\, dx = \frac{\cos\lambda c - \cos\lambda d}{\lambda} \to 0. \]

Any step function φ(x) is a linear combination of characteristic
functions of intervals, hence the assertion of the lemma also holds
for step functions. If now φ(x) is an arbitrary summable function,
then for a given ε > 0 we find a step function h(x) such that

\[ \int_a^b |\varphi(x) - h(x)|\, dx < \frac{\varepsilon}{2} \]

and λ_0 > 0 such that for |λ| > λ_0

\[ \left| \int_a^b h(x) \sin\lambda x\, dx \right| < \frac{\varepsilon}{2}. \]

Then for these values of λ

\[ \left| \int_a^b \varphi(x) \sin\lambda x\, dx \right| \le \int_a^b |\varphi(x) - h(x)|\, dx + \left| \int_a^b h(x) \sin\lambda x\, dx \right| < \varepsilon, \]

which gives the required result. The proof with cos λx as multiplier
is similar.
In particular we get the result:
The Fourier coefficient a_n of any integrable function φ(x) tends to
zero as n → ∞.
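Lemma 1 lends itself to a numerical illustration. The sketch below is not part of the text: the helper names, the midpoint rule and the choice of test functions are assumptions made for the example. It compares a quadrature value with the exact expression (cos λc − cos λd)/λ obtained in the proof for the characteristic function of (c, d), and then evaluates the integral for the unbounded summable function x^(−1/2) on (0, 1] at increasing λ.

```python
import math


def sine_integral(phi, a: float, b: float, lam: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of phi(x) * sin(lam * x) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for j in range(steps):
        x = a + (j + 0.5) * h
        total += phi(x) * math.sin(lam * x)
    return h * total


def indicator(c: float, d: float):
    """Characteristic function of the open interval (c, d)."""
    return lambda x: 1.0 if c < x < d else 0.0


def exact_indicator_integral(c: float, d: float, lam: float) -> float:
    """Exact value from the proof of lemma 1: (cos(lam*c) - cos(lam*d)) / lam."""
    return (math.cos(lam * c) - math.cos(lam * d)) / lam


if __name__ == "__main__":
    for lam in (10.0, 100.0, 1000.0):
        numeric = sine_integral(indicator(0.2, 1.3), 0.0, 2.0, lam)
        print(lam, numeric, exact_indicator_integral(0.2, 1.3, lam))
    # An unbounded but summable function on (0, 1].
    for lam in (10.0, 100.0, 1000.0):
        print(lam, sine_integral(lambda x: 1.0 / math.sqrt(x), 0.0, 1.0, lam))
```

For the characteristic function the decay is at least as fast as 2/λ; for a general summable function the lemma asserts only that the integral tends to zero, with no rate.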
We turn now to the integral (4)

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} [\varphi(x+t) - \varphi(x)]\, \frac{\sin(n+\frac12)t}{\sin\frac{t}{2}}\, dt = I_n. \]

Let us suppose that for a given value of x the summable function
φ(x) is defined and finite and the ratio

\[ \frac{\varphi(x+t) - \varphi(x)}{t} \]

is integrable with respect to t over |t| < δ and consequently over
the whole interval −π ≤ t ≤ π. Then, since t/sin(t/2) is bounded
on this interval, the function

\[ \frac{\varphi(x+t) - \varphi(x)}{\sin\frac{t}{2}} = \frac{\varphi(x+t) - \varphi(x)}{t} \cdot \frac{t}{\sin\frac{t}{2}} \]

is integrable between the limits −π ≤ t ≤ π and lemma 1 can be
applied to the integral I_n; we thus get the required result that I_n
tends to zero as n → ∞, and theorem 1 is proved.

Note. In a number of cases Dini's condition can be weakened,
but it cannot be totally rejected if convergence of the Fourier
series is to be preserved. There exist even continuous functions for
which the Fourier series diverges at isolated points (cf. art. 3).
A. N. Kolmogorov has constructed a summable function the
Fourier development of which diverges at every point.† There has
so far been no solution of the problem posed in 1915 by N. N. Luzin:
is the Fourier development of a function f ∈ L_2 convergent
almost everywhere?

A condition for the uniform convergence of a Fourier series
can be formulated in the same terms.

THEOREM 1'. If the function φ(x) is bounded and summable on
some set E ⊂ [−π, π] and Dini's condition is satisfied uniformly,
i.e. corresponding to any ε > 0 there exists δ > 0 such that for all
x ∈ E

\[ \int_{-\delta}^{\delta} \frac{|\varphi(x+t) - \varphi(x)|}{|t|}\, dt < \varepsilon, \]

then the Fourier series of φ(x) converges to φ(x) uniformly on the
set E.
For the proof we use a strengthened version of lemma 1.
Lemma 1'. The relation

\[ \lim_{\lambda\to\infty} \int_a^b f(t) \sin\lambda t\, dt = 0 \]

is realised uniformly on any set B of functions f(t) summable on
[a, b] that is compact with respect to the metric of L_1(a, b).


† Cf. A. Zygmund, Trigonometric Series, Stechert, N.Y. (1935), Chapter 8.


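For, given ε > 0, we can construct in L_1(a, b) a finite ε/2-net
for B; let this consist of the functions f_1(t), ..., f_m(t). By lemma 1
we can find λ_0 such that for λ > λ_0

\[ \left| \int_a^b f_j(t) \sin\lambda t\, dt \right| < \frac{\varepsilon}{2} \qquad (j = 1, 2, \ldots, m). \]

If now f(t) ∈ B is any function and for some j we have
||f − f_j|| < ε/2, then for λ > λ_0

\[ \left| \int_a^b f(t) \sin\lambda t\, dt \right| \le \int_a^b |f(t) - f_j(t)|\, dt + \left| \int_a^b f_j(t) \sin\lambda t\, dt \right| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]

as required.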

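We proceed to the proof of theorem 1'. Given ε > 0 we find
δ > 0 such that for all x ∈ E

\[ \int_{-\delta}^{\delta} \frac{|\varphi(x+t) - \varphi(x)|}{|t|}\, dt < \frac{\varepsilon}{3}. \tag{5} \]

Then

\[ s_n(x) - \varphi(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\varphi(x+t) - \varphi(x)}{\sin\frac{t}{2}}\, \sin(n+\tfrac12)t\, dt = \frac{1}{2\pi} \int_{-\delta}^{\delta} + \frac{1}{2\pi} \int_{\delta}^{\pi} + \frac{1}{2\pi} \int_{-\pi}^{-\delta}. \tag{6} \]

Since

\[ \left| \frac{t}{2\pi \sin\frac{t}{2}} \right| < 1 \]

and in virtue of (5), the first term does
not exceed ε/3 for any x ∈ E. To obtain bounds for the remaining
terms we show that the functions

\[ f_x(t) = \frac{1}{2\pi}\, \frac{\varphi(x+t) - \varphi(x)}{\sin\frac{t}{2}}, \]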


as functions of t ∈ [δ, π] with parameter x ∈ E, form a compact
set B in the space L_1(δ, π). Let x_n ∈ E be any sequence of points;
we can suppose that the x_n tend to some point x_0 and that the
values φ(x_n) tend to some number c_0. Then in the metric of
L_1(δ, π) we get φ(x_n + t) → φ(x_0 + t), which means that

\[ \|f_{x_n} - f_{x_\nu}\| \le \frac{1}{2\pi \sin\frac{\delta}{2}} \Bigl[ \|\varphi(x_n+t) - \varphi(x_\nu+t)\| + \pi\, |\varphi(x_n) - \varphi(x_\nu)| \Bigr] \to 0, \]

i.e. the sequence f_{x_n}(t) is fundamental in L_1(δ, π).
Thus the set B is compact. By lemma 1' we can find λ_0 such
that for n > λ_0

\[ \left| \frac{1}{2\pi} \int_{\delta}^{\pi} \frac{\varphi(x+t) - \varphi(x)}{\sin\frac{t}{2}}\, \sin(n+\tfrac12)t\, dt \right| < \frac{\varepsilon}{3} \]

for all x ∈ E. The last term in (6) can be treated similarly. We see as
a result that for sufficiently large n the quantity |s_n(x) − φ(x)|
becomes < ε for all x ∈ E, which proves theorem 1'.

COROLLARY. If all the derived numbers of a summable function
φ(x) are bounded by some constant K on an interval [α, β] ⊂ [−π, π],
then the Fourier series of φ(x) converges uniformly on any closed
interval [α', β'] such that α < α' < β' < β.

For we have for x ∈ [α', β'] and all |t| < min (β − β', α' − α)

\[ |\varphi(x+t) - \varphi(x)| \le K\,|t|, \]

so that φ(x) is bounded on [α', β'] and Dini's condition is satisfied
uniformly.

For instance, if a summable function φ(x) vanishes on the
closed interval [α, β], its Fourier series converges to zero uniformly
on any closed interval [α', β'] interior to [α, β].


Problems. 1. If a function φ(x) is continuous at x = x_0 and is of bounded
variation in the neighbourhood of x_0, its Fourier development converges
to φ(x_0) for x = x_0.

Hint. It is sufficient to consider a non-decreasing φ(x). Use the second
mean value theorem:

\[ \int_0^h \varphi(t)\, g(t)\, dt = \varphi(h) \int_{\xi}^{h} g(t)\, dt, \qquad 0 \le \xi \le h, \]

and the fact that

\[ \int_{n\xi}^{nh} \frac{\sin t}{t}\, dt \]

is uniformly bounded.

2. Without the assumption that φ(x) is continuous in problem 1, the
Fourier development converges to ½[φ(x + 0) + φ(x − 0)].

Hint. Since the Dirichlet kernel is even,

\[ \int_{-\pi}^{\pi} \varphi(x+t)\, D_n(t)\, dt = \int_0^{\pi} [\varphi(x+t) + \varphi(x-t)]\, D_n(t)\, dt. \]


2. We now turn our attention to questions concerning the convergence
of the Fourier series

\[ \sum_{n=-\infty}^{\infty} a_n e^{inx} \tag{1} \]

in the norms of different functional spaces. We begin by recalling
some well-known facts in connection with the convergence of
Fourier series. It is proved in elementary courses of analysis that
every function φ(x) that is continuous on [−π, π], is piecewise-smooth,
and satisfies the condition φ(π) = φ(−π) (thus ensuring
the continuity of the 2π-periodic continuation of φ(x) over the
whole axis) has a Fourier series development which converges
absolutely and uniformly, in particular, that is, in the norm of the
space C(−π, π). On the other hand, we saw in Chapter V that
every function φ(x) square-summable on [−π, π] is the sum of a
Fourier series (1) which converges to φ(x) in the quadratic mean
[i.e. in the metric of the space L_2(−π, π)].

Series in the functions e^{inx} can also be constructed and studied
in other normed spaces of functions on [−π, π]. There are too
many such spaces for us to consider them all; we shall confine
ourselves to an important class of spaces, which contains the majority
of those used in analytic applications.

Definition. A normed space R of functions defined on the closed
interval [−π, π] is said to be a homogeneous functional space if the
following conditions are satisfied.

(1) All functions φ(x) ∈ R are summable and the convergence
φ_n(x) → φ(x) in the norm of R implies the convergence
φ_n(x) → φ(x) in the norm of L_1(−π, π), i.e. the relation

\[ \int_{-\pi}^{\pi} |\varphi_n(x) - \varphi(x)|\, dx \to 0. \]

(2) If a function φ(x) is continued over the whole axis of x as a
periodic function of period 2π, then for any real h the functions
φ(x + h), the displacements of φ(x), will be well-defined. All
such displacements are required to belong to the space R together
with the functions φ(x) and the norm is to be invariant under
translation:

\[ \|\varphi(x+h)\| = \|\varphi(x)\| \quad \text{for any } h. \tag{2} \]

(3) The space R contains all trigonometric polynomials (linear
combinations of the functions e^{inx}), and their aggregate forms a
set everywhere dense in R.

Conditions (1)-(3) are satisfied by many of the functional spaces
with which we are familiar. They hold for example in the spaces
L_p(−π, π) for all p ≥ 1. In the space C(−π, π) of all continuous
functions on [−π, π] condition (2) is not satisfied since a continuous
function φ(x) with φ(π) ≠ φ(−π) ceases to be continuous as a
periodic continuation over the whole axis and φ(x + π), for
example, is not contained in C(−π, π). But if in place of the
whole space C(−π, π) we consider only the subspace C̃(−π, π)
determined by the condition φ(−π) = φ(π), conditions (1)-(3)
will all be satisfied. The analogous subspace D̃_m(−π, π) of the
space D_m(−π, π) determined by the conditions φ(−π) = φ(π),
φ'(−π) = φ'(π), ..., φ^(m)(−π) = φ^(m)(π) also satisfies conditions
(1)-(3).

We shall establish first of all the form of the Fourier series in a
homogeneous space R. Strictly speaking we require only condition
(1) here.

Lemma 2. If for some function φ(x) which belongs to a homogeneous
space R the development

\[ \varphi(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx} \tag{3} \]

converges in the norm of R, then the coefficients a_m are the usual
Fourier coefficients of φ(x), given by

\[ a_m = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(x)\, e^{-imx}\, dx. \tag{4} \]


Proof. By condition (1) we have

\[ \int_{-\pi}^{\pi} \left| \sum_{m=-n}^{n} a_m e^{imx} - \varphi(x) \right| dx \to 0. \]

But then for every fixed k

\[ \int_{-\pi}^{\pi} e^{-ikx} \left[ \sum_{m=-n}^{n} a_m e^{imx} - \varphi(x) \right] dx \to 0, \]

and hence in virtue of the orthogonality of the functions e^{imx} over
[−π, π]

\[ \int_{-\pi}^{\pi} \varphi(x)\, e^{-ikx}\, dx = \lim_{n\to\infty} \int_{-\pi}^{\pi} e^{-ikx} \sum_{m=-n}^{n} a_m e^{imx}\, dx = a_k \cdot 2\pi, \]

which yields formula (4).

The following property of functions which belong to a homogeneous
space plays an important part in the subsequent discussion.

Lemma 3. Every function φ(x) that belongs to a homogeneous space
R is continuous in the norm under displacement: to any ε > 0
corresponds δ > 0 such that for |h| < δ we have ||φ(x + h) − φ(x)|| < ε.

Proof. We denote by Q the totality of functions φ(x) ∈ R which
are continuous in the norm under displacement. Evidently the
set Q is a subspace of R: it contains together with functions
φ(x), ψ(x) every linear combination of them. We shall show that
Q is closed in the norm. Let φ_n → φ, where φ_n ∈ Q. For a given
ε > 0 we find n such that ||φ − φ_n|| < ε/3. Then we choose δ in
the condition for continuity of φ_n(x) under displacement so that
||φ_n(x + h) − φ_n(x)|| < ε/3 for |h| < δ. By condition (2) we shall
have at the same time ||φ(x + h) − φ_n(x + h)|| < ε/3. Hence

\[ \|\varphi(x+h) - \varphi(x)\| \le \|\varphi(x+h) - \varphi_n(x+h)\| + \|\varphi_n(x+h) - \varphi_n(x)\| + \|\varphi_n(x) - \varphi(x)\| < \varepsilon, \]

so that the function φ(x) is also continuous under displacement.

Finally we observe that each of the functions e^{imx} is continuous
under displacement since

\[ \|e^{im(x+h)} - e^{imx}\| = |e^{imh} - 1|\, \|e^{imx}\| \to 0 \quad \text{for } h \to 0. \]

Thus the set Q contains all trigonometric polynomials and is
closed; it follows from condition (3) that Q = R, and the lemma
is proved.

We now formulate our basic problems.

A. We are given a function φ(x) contained in a homogeneous
space R and its Fourier coefficients are calculated in accordance
with formula (4). Does the Fourier series

\[ \sum_{n=-\infty}^{\infty} a_n e^{inx} \]

converge to φ(x) in the norm of the space R?

The answer to this question is in general negative: there exist
homogeneous spaces, moreover ones with which we are very
familiar such as C̃(−π, π) and L_1(−π, π), in each of which the
Fourier series is found to be divergent for some functions.

Such an answer evokes a natural desire:

B. To find a procedure, as simple and uniform as possible, which
will allow a function φ(x) to be deduced from its Fourier series
despite the possible divergence of the latter.

In attempting to fulfil this desire we obtain the following result:


THEOREM 2. The arithmetic means

\[ \sigma_p(x) = \frac{s_0(x) + s_1(x) + \cdots + s_{p-1}(x)}{p} \]

of the partial sums

\[ s_n(x) = \sum_{k=-n}^{n} a_k e^{ikx} \]

of the Fourier series of any function φ(x) ∈ R converge in the norm
to φ(x) as p → ∞.

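3. In this paragraph we shall satisfy ourselves that the spaces
C̃(−π, π), L_1(−π, π) really do contain functions whose Fourier
series do not converge in the norm of the space. We begin by
showing that the integrals

\[ \rho_n = \int_{-\pi}^{\pi} |D_n(t)|\, dt, \qquad D_n(t) = \frac{1}{2\pi}\, \frac{\sin(n+\frac12)t}{\sin\frac{t}{2}}, \]

increase without bound as n → ∞. The graph of the function
D_n(t) is given in Fig. 15. At points t for which (n + 1/2)t = π/2,
3π/2, ..., (k + 1/2)π, ..., the quantity |sin (n + 1/2)t| is equal
to unity, while in the intervals

\[ \left| (n+\tfrac12)t - (k+\tfrac12)\pi \right| < \frac{\pi}{3} \qquad (k = 0, 1, 2, \ldots) \]

it exceeds 1/2. In the same intervals sin t/2 does not exceed

\[ \frac{t}{2} < \frac{(k+\frac12)\pi + \frac{\pi}{3}}{2(n+\frac12)} < \frac{(k+1)\pi}{2n}. \]

[Fig. 15: graph of the Dirichlet kernel D_n(t)]

Hence the integral of |D_n(t)|, taken only over the intervals
specified (k = 0, 1, ..., n − 1), each of which has length 2π/[3(n + ½)],
exceeds

\[ \sum_{k=0}^{n-1} \frac{1}{2\pi} \cdot \frac12 \cdot \frac{2n}{(k+1)\pi} \cdot \frac{2\pi}{3(n+\frac12)} = \frac{n}{3\pi(n+\frac12)} \sum_{k=0}^{n-1} \frac{1}{k+1}, \]

which increases without bound as n → ∞.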
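The growth of the integrals ρ_n can be observed numerically. The sketch below is an illustration only: the function names, the midpoint rule and the number of quadrature points are assumptions made for the example, and the lower bound it evaluates is the harmonic-sum estimate n/[3π(n + ½)] · (1 + 1/2 + ... + 1/n), which is one reading of the estimate in the text.

```python
import math


def dirichlet_kernel(n: int, t: float) -> float:
    """D_n(t) = sin((n + 1/2) t) / (2*pi*sin(t/2)), with its limit where sin(t/2) = 0."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return (2 * n + 1) / (2.0 * math.pi)
    return math.sin((n + 0.5) * t) / (2.0 * math.pi * s)


def rho(n: int, steps_per_lobe: int = 200) -> float:
    """Midpoint-rule approximation of the integral of |D_n(t)| over [-pi, pi]."""
    steps = steps_per_lobe * (2 * n + 1)
    h = 2.0 * math.pi / steps
    return h * sum(abs(dirichlet_kernel(n, -math.pi + (j + 0.5) * h)) for j in range(steps))


def harmonic_lower_bound(n: int) -> float:
    """n / (3*pi*(n + 1/2)) * (1 + 1/2 + ... + 1/n)."""
    return n / (3.0 * math.pi * (n + 0.5)) * sum(1.0 / (k + 1) for k in range(n))


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, rho(n), harmonic_lower_bound(n))
```

Both columns grow like a multiple of log n; the lower bound is far from sharp, but divergence is all that the argument requires.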


In this fact lies the primary reason why the Fourier series does
not, in general, converge in the spaces C̃(−π, π), L_1(−π, π). Let
us consider on the space C̃(−π, π) the functionals

\[ \Phi_n(\varphi) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(t)\, \frac{\sin(n+\frac12)t}{\sin\frac{t}{2}}\, dt = \int_{-\pi}^{\pi} \varphi(t)\, D_n(t)\, dt, \]

which give the value of the partial sum s_n(x) of the Fourier series
for φ(x) at x = 0. Each of the functionals Φ_n(φ) is bounded on the
unit sphere of the space C̃(−π, π), but they do not have a common
bound on this sphere. For if we take as φ(t) continuous functions
which approximate to +1 on intervals where D_n(t) is positive
and to −1 on intervals where D_n(t) is negative [such functions
do not exceed 1 in absolute value and therefore belong to the
unit sphere of C̃(−π, π)], we shall obtain arbitrarily large numerical
values of Φ_n(φ) (by taking n sufficiently large). We claim
that there exists a function φ_0(x) for which the values of the
functionals Φ_n(φ_0) are unbounded (so that the corresponding Fourier
series diverges for x = 0). This is a consequence of the following
general lemma of functional analysis:

Lemma 4. If a sequence of linear functionals Φ_n on a complete
normed space R is unbounded on the unit sphere ||φ|| ≤ 1, there
exists an element φ_0 for which the values Φ_n(φ_0) are unbounded.

The proof of this lemma will be found in the Supplement,
Section 2 (p. 465).

Hence, there exists a continuous function whose Fourier development
is divergent at x = 0.

With minor technical modifications, the same argument can be
carried through for space L_1(−π, π). We assume that the partial
sums S_n φ of the Fourier development of any φ ∈ L_1 converge in
the norm to the element φ. An even weaker assumption can be
made, namely that the numbers ||S_n φ|| are bounded for any φ ∈ L_1.
We can now assert that the numbers ||S_n φ|| are bounded by the
same constant K for all ||φ|| ≤ 1. If this were not the case, we
could apply the same lemma of functional analysis and find an
element φ_0 such that the numbers ||S_n φ_0|| are unbounded. Now
let μ(x) be any bounded measurable function, say not exceeding 1
in absolute value. We have the inequality

\[ \left| \int_{-\pi}^{\pi} S_n\varphi(x)\, \mu(x)\, dx \right| \le \int_{-\pi}^{\pi} |S_n\varphi(x)|\, dx = \|S_n\varphi\| \le K. \]


Introducing the explicit expression for operator S_n, we get

\[ \left| \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \varphi(x+t)\, D_n(t)\, \mu(x)\, dt\, dx \right| = \left| \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} \varphi(t)\, D_n(x-t)\, \mu(x)\, dx\, dt \right| \le K. \tag{1} \]

But it is easily shown that this inequality cannot be fulfilled for
all n, any φ ∈ L_1 with ||φ|| ≤ 1 and any measurable μ(x) with
|μ(x)| ≤ 1. For we can write (1) in the form

\[ \left| \int_{-\pi}^{\pi} \varphi(t)\, M_n(t)\, dt \right| \le K, \tag{2} \]

where

\[ M_n(t) = \int_{-\pi}^{\pi} D_n(x-t)\, \mu(x)\, dx \]

is obviously a continuous function of t. We put φ(t) equal to 1/2ε
for |t| < ε and 0 outside this interval; we get ||φ|| = 1 and the
integral in (2) becomes the mean of the function M_n(t) over the
interval |t| < ε. Proceeding to the limit as ε → 0, we get

\[ |M_n(0)| = \left| \int_{-\pi}^{\pi} D_n(x)\, \mu(x)\, dx \right| \le K. \]

We have not yet defined the function μ(x). Let us put it equal
to +1 for D_n(x) > 0 and −1 for D_n(x) < 0; then we get

\[ \int_{-\pi}^{\pi} |D_n(x)|\, dx \le K. \]

But we have already shown that such an inequality cannot hold
for all n. Hence there exists an element φ_0 ∈ L_1 such that the partial
sums S_n φ_0 do not converge to φ_0 in the norm of L_1(−π, π).

Of course our results do not preclude the convergence S_n φ → φ
from holding for every element in particular homogeneous
spaces. But in this event the property is peculiar to the space
under consideration. We know, for example, that the space
L_2(−π, π) has this property. There is also a theorem due to
M. Riesz which establishes an analogous property for any space
L_p(−π, π) with p > 1†.

Note. Theorem 2 ceases to hold if we do not assume that space
R is homogeneous; it can happen, moreover, that no linear combinations
of partial sums of the Fourier development of φ(x) converge
to φ(x) in the norm of R.

We take as an example the space R of functions φ(x), continuous
for |x| ≤ π/2, belonging to L_1 for |x| ≤ π and with the norm

\[ \|\varphi\| = \max_{|x| \le \pi/2} |\varphi(x)| + \int_{-\pi}^{\pi} |\varphi(x)|\, dx. \]


† Cf. A. Zygmund, Trigonometric Series, Stechert, N.Y. (1935), Chapter 7.


This space obviously satisfies conditions (1) and (3) for a homogeneous
space, but does not satisfy condition (2). The function
φ(x), equal to +π/4 for |x| ≤ π/2 and −π/4 at the remaining
points of the interval [−π, π], belongs to R and has the formal
Fourier development

\[ \cos x - \frac13 \cos 3x + \frac15 \cos 5x - \cdots \]

All the terms vanish at x = ±π/2. Hence every linear combination
of partial sums also vanishes at x = ±π/2. Since φ(x)
itself is equal to π/4 at x = ±π/2, whilst convergence in the
norm of R requires in particular uniform convergence in the
closed interval [−π/2, π/2], it is clear that there can be no
linear combinations of partial sums of the Fourier development
that converge to φ(x) in the norm of space R.
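The behaviour of this development at the two points in question is easily checked by computation. The following sketch is illustrative only (the function name and the number of terms are assumptions): it evaluates partial sums of cos x − (1/3) cos 3x + (1/5) cos 5x − ... at x = π/2, where every term vanishes up to rounding error, and at x = 0, where the sums approach π/4.

```python
import math


def partial_sum(x: float, terms: int) -> float:
    """Sum of the first `terms` terms of cos x - cos(3x)/3 + cos(5x)/5 - ..."""
    return sum((-1) ** k * math.cos((2 * k + 1) * x) / (2 * k + 1) for k in range(terms))


if __name__ == "__main__":
    # At x = pi/2 every term is zero (up to floating-point rounding).
    print(partial_sum(math.pi / 2.0, 200))
    # At x = 0 the partial sums approach pi/4, the value of the function there.
    print(partial_sum(0.0, 2000), math.pi / 4.0)
```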

4. Our primary device in proving theorem 2 will be the integration
of continuous abstract functions with values in a normed space R.
In this paragraph we shall give the relevant definitions and essential
elements of the theory.

Let f(t) denote an element of a complete normed space R, which
depends on a real parameter t, or what is the same thing, a function
of the parameter t with values in the space R. Such functions are
said to be abstract. We shall say that f(t) is continuously dependent
on the parameter t at t = t_0 if whenever t → t_0

\[ \|f(t) - f(t_0)\| \to 0. \]

An abstract function f(t) which is continuously dependent on t at
any t = t_0 in the closed interval a ≤ t ≤ b is said to be a continuous
abstract function of t on [a, b].

The following propositions, which are the natural generalisations
of well-known elementary theorems of analysis, are easily proved
by means of the usual arguments using the compactness of the
closed interval:

(a) An abstract function f(t), continuous on the interval [a, b],
is bounded in the norm, so that ||f(t)|| ≤ M for all t.

(b) An abstract function f(t), continuous on the interval [a, b],
is uniformly continuous on it: for any ε > 0 there exists δ > 0
such that ||f(t') − f(t'')|| < ε whenever |t' − t''| < δ.

(c) A sequence of abstract functions f_n(t) is said to converge to
an abstract function f(t) uniformly on [a, b] if for any ε > 0 there
exists N = N(ε) such that for n > N

\[ \max_{a \le t \le b} \|f_n(t) - f(t)\| < \varepsilon. \]

The limit f(t) of a uniformly convergent sequence of continuous
functions f_n(t) is also a continuous function.

We define further the Riemann integral of a function f(t). Let π
denote a partition a = t_0 < t_1 < ... < t_n = b of the interval [a, b]
and let Δt_j = t_{j+1} − t_j. We shall call the quantity d(π) = max Δt_j
the parameter of the partition π. We form the integral sum

\[ S_\pi = \sum_{j=0}^{n-1} f(t_j)\, \Delta t_j. \tag{1} \]

The quantity S_π is evidently an element of the same space R. We
claim that when the partition π is indefinitely refined, i.e. as
d(π) → 0, the integral sum S_π tends to a uniquely determined
element If of R, said to be the Riemann integral of the abstract
function f(t).
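The integral sums (1) are easy to form in a concrete case. In the sketch below, which is an illustration and not part of the text, the space R is taken to be the plane with the maximum norm, and the abstract function is f(t) = (cos t, sin t) on [0, π/2], whose integral is the element (1, 1); the names and the choice of a uniform partition are assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def integral_sum(f: Callable[[float], Vector], partition: Sequence[float]) -> Vector:
    """Integral sum S = sum_j f(t_j) * (t_{j+1} - t_j) for a vector-valued f."""
    total = [0.0] * len(f(partition[0]))
    for t_j, t_next in zip(partition, partition[1:]):
        value = f(t_j)
        dt = t_next - t_j
        total = [acc + v * dt for acc, v in zip(total, value)]
    return total


def uniform_partition(a: float, b: float, n: int) -> List[float]:
    """Partition of [a, b] into n equal parts; its parameter is (b - a) / n."""
    return [a + (b - a) * j / n for j in range(n + 1)]


def max_norm(v: Vector) -> float:
    return max(abs(c) for c in v)


def error(n: int) -> float:
    """Distance in the norm between the integral sum and the integral (1, 1)."""
    f = lambda t: [math.cos(t), math.sin(t)]
    s = integral_sum(f, uniform_partition(0.0, math.pi / 2.0, n))
    return max_norm([s[0] - 1.0, s[1] - 1.0])


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, error(n))
```

As the parameter of the partition decreases, the distance between the integral sum and the integral decreases with it.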

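We prove the existence of the integral.

Lemma 5. Let ε > 0 be given and let δ > 0 be chosen so that
||f(t') − f(t'')|| < ε/2 whenever |t' − t''| < δ; then for d(π) < δ the
sums (1) differ by at most ε(b − a).

We consider first the integral sums

\[ s = \sum_{j=0}^{n-1} f(t_j)\, \Delta t_j, \qquad s' = \sum_{k=0}^{m-1} f(t'_k)\, \Delta t'_k, \]

where the partition of the primary interval corresponding to the
second sum is a refinement of the partition corresponding to the first
sum. In this case each term f(t_j) Δt_j of the first sum is replaced in
the refinement by a quantity of the form

\[ f(t'_{j1})\, \Delta t'_{j1} + \cdots + f(t'_{jr})\, \Delta t'_{jr}, \qquad \Delta t'_{j1} + \cdots + \Delta t'_{jr} = \Delta t_j. \]

By hypothesis each of the quantities f(t'_{j1}), ..., f(t'_{jr}) can be replaced
here by f(t_j) with an error in the norm of less than ε/2:

\[ f(t'_{j1}) = f(t_j) + h_1, \quad \ldots, \quad f(t'_{jr}) = f(t_j) + h_r, \qquad \|h_i\| < \frac{\varepsilon}{2}. \]

Hence

\[ \left\| f(t'_{j1})\, \Delta t'_{j1} + \cdots + f(t'_{jr})\, \Delta t'_{jr} - f(t_j)\, \Delta t_j \right\| \le \sum_{i=1}^{r} \|h_i\|\, \Delta t'_{ji} < \frac{\varepsilon}{2}\, \Delta t_j, \]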


and therefore

\[ \|s - s'\| < \frac{\varepsilon}{2} \sum_{j} \Delta t_j = \frac{\varepsilon}{2}\,(b - a). \]

Now let s_1, s_2 be any two integral sums, with the sole restriction
that the elements of the corresponding partitions do not exceed the
given δ. We form the integral sum s_3 corresponding to the partition
obtained by superposing the former partitions. Then by what we
have proved

\[ \|s_1 - s_3\| < \frac{\varepsilon}{2}\,(b - a), \qquad \|s_2 - s_3\| < \frac{\varepsilon}{2}\,(b - a), \]

from which we get the required result

\[ \|s_1 - s_2\| < \varepsilon\,(b - a). \]


It is easily deduced from lemma 5 that as the interval [a, b]
undergoes indefinitely refined partitions the sums (1) tend to a
limit. For let π_n be an arbitrary sequence of partitions with
d(π_n) → 0. By what we have proved the corresponding integral
sums s_n form a fundamental sequence; we denote its limit in R
by If. Any other sequence of integral sums s'_n with d(π'_n) → 0
has the same limit, since, as we have shown, ||s_n − s'_n|| → 0. We
shall call the element If the integral of the function f(t) over the
interval [a, b].

The integral of an abstract function possesses the usual properties
of an integral:

\[ I(f + g) = If + Ig; \tag{2} \]
\[ I(\alpha f) = \alpha\, If; \tag{3} \]
\[ \text{if } \|f\| \le M, \text{ then } \|If\| \le M(b - a). \tag{4} \]

The product of a continuous abstract function f(t) by a real
continuous function β(t) is again a continuous abstract function.
If in addition β(t) ≥ 0, ||f|| ≤ M, then

\[ \|I(\beta f)\| \le M \int_a^b \beta(t)\, dt. \tag{5} \]

All these properties are proved by taking limits in the relevant
integral sums.



An important example is furnished by abstract functions of a
parameter t which take values in a normed space R of ordinary
functions of an argument x, so that f(t) = φ(x, t).

It is evident from the definition of the integral of an abstract
function that in this case the operation If and the ordinary integration
of the function φ(x, t) with respect to the variable t lead
to one and the same result.

5. In this article we shall prove theorem 2: the arithmetic means
of the partial sums of the Fourier series of any function φ(x) that
belongs to a homogeneous space R converge to φ(x) in the norm of R.

Before carrying out the proof, we shall obtain an expression for
the arithmetic means of the partial sums of the Fourier series.

We put

\[ \sigma_n(x) = \frac{s_0(x) + s_1(x) + \cdots + s_{n-1}(x)}{n}, \]

where

\[ s_m(x) = \sum_{k=-m}^{m} a_k e^{ikx} \]

is a partial sum of the Fourier series of the function φ(x). As we saw above

\[ s_m(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(x+t)\, \frac{\sin(m+\frac12)t}{\sin\frac{t}{2}}\, dt; \]

hence

\[ \sigma_n(x) = \frac{1}{2\pi n} \int_{-\pi}^{\pi} \varphi(x+t) \sum_{m=0}^{n-1} \frac{\sin(m+\frac12)t}{\sin\frac{t}{2}}\, dt. \]


The sum under the integral sign is easily calculated if we multiply
the numerator and denominator of each term by sin t/2:

\[ \sum_{m=0}^{n-1} \frac{\sin(m+\frac12)t\, \sin\frac{t}{2}}{\sin^2\frac{t}{2}} = \sum_{m=0}^{n-1} \frac{\cos mt - \cos(m+1)t}{2\sin^2\frac{t}{2}} = \frac{1 - \cos nt}{2\sin^2\frac{t}{2}} = \frac{\sin^2\frac{nt}{2}}{\sin^2\frac{t}{2}}. \]

Thus we get:

\[ \sigma_n(x) = \frac{1}{2\pi n} \int_{-\pi}^{\pi} \varphi(x+t)\, \frac{\sin^2\frac{nt}{2}}{\sin^2\frac{t}{2}}\, dt. \tag{1} \]

The function

\[ F_n(t) = \frac{1}{2\pi n}\, \frac{\sin^2\frac{nt}{2}}{\sin^2\frac{t}{2}} \]

is known as Fejér's kernel. In contrast with Dirichlet's kernel
Fejér's kernel is non-negative. Further, if φ(x) ≡ 1, then s_n(x) ≡ 1,
σ_n(x) ≡ 1, and (1) implies that

\[ \int_{-\pi}^{\pi} F_n(t)\, dt = 1. \]
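The three properties of Fejér's kernel used below (it is the arithmetic mean of Dirichlet kernels, it is non-negative, and its integral is 1) can be verified numerically. The sketch is illustrative only; the function names and the midpoint rule are assumptions made for the example.

```python
import math


def dirichlet_kernel(m: int, t: float) -> float:
    """D_m(t) = sin((m + 1/2) t) / (2*pi*sin(t/2)), with its limit where sin(t/2) = 0."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return (2 * m + 1) / (2.0 * math.pi)
    return math.sin((m + 0.5) * t) / (2.0 * math.pi * s)


def fejer_kernel(n: int, t: float) -> float:
    """F_n(t) = sin^2(n t / 2) / (2*pi*n*sin^2(t/2)), with its limit where sin(t/2) = 0."""
    s = math.sin(t / 2.0)
    if abs(s) < 1e-12:
        return n / (2.0 * math.pi)
    return math.sin(n * t / 2.0) ** 2 / (2.0 * math.pi * n * s * s)


def fejer_from_dirichlet(n: int, t: float) -> float:
    """Arithmetic mean of D_0(t), ..., D_{n-1}(t)."""
    return sum(dirichlet_kernel(m, t) for m in range(n)) / n


def integrate_midpoint(f, a: float, b: float, steps: int = 4000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))


if __name__ == "__main__":
    n = 6
    print(fejer_kernel(n, 0.9), fejer_from_dirichlet(n, 0.9))
    print(integrate_midpoint(lambda t: fejer_kernel(n, t), -math.pi, math.pi))
```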


We turn to the proof of theorem 2. We show first that it holds
for any trigonometric polynomial

\[ \varphi(x) = \sum_{k=-n}^{n} a_k e^{ikx}. \]

We express the arithmetic mean of the first p partial sums of
the Fourier series of φ(x) in the form

\[ \sigma_p(x) = \frac{s_0(x) + \cdots + s_{n-1}(x)}{p} + \frac{s_n(x) + \cdots + s_{p-1}(x)}{p} = \frac{s_0(x) + \cdots + s_{n-1}(x)}{p} + \frac{p-n}{p}\, \varphi(x), \]

since for q ≥ n the partial sums s_q(x) coincide with the function
φ(x) itself. As p → ∞ the first term tends to zero and the second
to φ(x); hence as p → ∞, we have σ_p(x) → φ(x), as required.

We now consider the general case. Equation (1) gives σ_n(x) as
the result of operating on an element φ ∈ R with a linear integral
operator, say A_n, with Fejér's kernel. By lemma 3 the displacement
φ(x + t) is a continuous abstract function of t with values in the
space R, and by condition (2) its norm is equal to ||φ|| for every t.
We apply inequality (5) of art. 4 to the integral on the right-hand
side, regarded as the integral of this abstract function multiplied
by the non-negative function F_n(t); we get the bound

\[ \|\sigma_n(x)\| = \|A_n\varphi\| \le \|\varphi\| \int_{-\pi}^{\pi} F_n(t)\, dt = \|\varphi\|. \]

This means that the norm of the operator A_n does not exceed 1
for any n. We now employ the following simple lemma:

Lemma 6. Let a sequence of linear operators A_n bounded in the
norm by a fixed constant K be given on a normed space R. If the
relation A_n φ → φ holds for the elements φ that belong to some
everywhere dense set Q ⊂ R, then it holds for all elements φ ∈ R.

With the aid of this lemma the proof of theorem 2 is quickly
completed, for we have to establish the relation A_n φ → φ, where
the A_n are operators with Fejér's kernel. But we have seen that
the norms of these operators are bounded by the number 1 and
that the relation A_n φ → φ holds for trigonometric polynomials,
which are everywhere dense in the homogeneous space R by
hypothesis. By lemma 6, A_n φ → φ for all φ ∈ R, and theorem 2 is
proved.

It remains for us to prove lemma 6. Let φ ∈ R be any element
and let ε > 0 be given. We find an element φ_ε ∈ Q such that
||φ − φ_ε|| < ε and a number N such that ||A_n φ_ε − φ_ε|| < ε for
all n > N. Then for n > N we have

\[ \|A_n\varphi - \varphi\| \le \|A_n\varphi - A_n\varphi_\varepsilon\| + \|A_n\varphi_\varepsilon - \varphi_\varepsilon\| + \|\varphi_\varepsilon - \varphi\| \le K\varepsilon + \varepsilon + \varepsilon, \]

and it follows that A_n φ → φ as n → ∞.

In applying theorem 2 to the space C̃(−π, π) we obtain the
following result: every function φ(x) that is continuous on the interval
[−π, π] and satisfies the condition φ(−π) = φ(π) is the limit of
the uniformly convergent sequence of arithmetic means of partial
sums of its Fourier series. Theorem 2 was first proved in this
special form by L. Fejér in 1905.
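The uniform convergence of the arithmetic means can be observed in a simple case. The sketch below is illustrative and not part of the text. It takes φ(x) = |x| on [−π, π], whose Fourier development is the standard series π/2 − (4/π)(cos x + cos 3x / 9 + cos 5x / 25 + ...), forms the means σ_p directly as averages of partial sums, and measures the largest deviation from φ on a grid; the function names and the grid size are assumptions.

```python
import math


def fejer_mean(x: float, p: int) -> float:
    """sigma_p(x) = (s_0 + ... + s_{p-1}) / p for phi(x) = |x| on [-pi, pi]."""
    total = 0.0            # running sum s_0 + s_1 + ... of partial sums
    s = math.pi / 2.0      # s_0(x): the constant term of the series
    for n in range(p):
        if n % 2 == 1:
            # Only odd harmonics are present in the series for |x|.
            s -= 4.0 / math.pi * math.cos(n * x) / (n * n)
        total += s         # s now equals s_n(x)
    return total / p


def sup_error(p: int, grid: int = 400) -> float:
    """Largest value of |sigma_p(x) - |x|| over an equally spaced grid on [-pi, pi]."""
    points = (-math.pi + 2.0 * math.pi * j / grid for j in range(grid + 1))
    return max(abs(fejer_mean(x, p) - abs(x)) for x in points)


if __name__ == "__main__":
    for p in (4, 16, 64):
        print(p, sup_error(p))
```

The deviation is greatest at the corners x = 0 and x = ±π of the graph and decreases as p grows, in accordance with the theorem.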

Applied to the space L_1(−π, π) theorem 2 leads to an important
uniqueness property:

If all the Fourier coefficients of a summable function φ(x) vanish,
the function φ(x) itself vanishes (almost everywhere). For it follows
from the condition of the theorem that all the terms in the Fourier
series of φ(x) vanish; but then all the s_n(x) vanish, likewise all the
σ_n(x), and consequently in the norm of L_1

\[ \varphi(x) = \lim_{n\to\infty} \sigma_n(x) = 0. \]

The same property is alternatively expressed as follows:

If all the Fourier coefficients of two integrable functions φ(x), ψ(x)
coincide in pairs, the functions themselves coincide almost everywhere.

For the proof it is sufficient to form the difference f(x) = φ(x)
− ψ(x); all its Fourier coefficients vanish by hypothesis and therefore
f(x) vanishes almost everywhere.


Problems. 1. A point x_0 is said to be a generalised Dini point for a
summable function φ(x) if the integral


converges for some c.
Show that the Fourier series of the function φ converges to the value c
at a generalised Dini point.


2. A point x_0 is said to be a (regular) Dini point for a summable function
φ(x) if the integral

\[ \int_{-\delta}^{\delta} \frac{|\varphi(x_0+t) - \varphi(x_0)|}{|t|}\, dt \]

converges.
Show that almost all generalised Dini points are regular Dini points.
Hint. Show that every generalised Dini point that is also a Lebesgue
point for the function φ(x) is a regular Dini point.


3. If f(t) is a continuous abstract function (a ≤ t ≤ b), then

\[ \lim_{\lambda\to\infty} \left\| \int_a^b f(t) \sin\lambda t\, dt \right\| = 0. \]

4. Show that the terms of the Fourier series of a function φ(x) which
belongs to a homogeneous space R tend to zero in the norm of R.

Hint. Interpret the Fourier series term

\[ a_n e^{inx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(\xi)\, e^{in(x-\xi)}\, d\xi = \frac{1}{2\pi} \int_{-\pi}^{\pi} \varphi(x+t)\, e^{-int}\, dt \]

as the integral of a continuous abstract function. Apply the result of
problem 3.


5. If a sequence {y_n} of elements of a normed space R converges in the
norm to an element y, the sequence of arithmetic means
s_n = (y_1 + ... + y_n)/n also converges in the norm to y.


6. Prove that at every Lebesgue point of a summable function the arith- 
metic means of its Fourier series converge to the value of the function. 
Deduce the uniqueness theorem (cf. above). 


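Hint. Put u(t) = |φ(x + t) − φ(x)|, U'(t) = u(t), U(0) = 0, and choose
δ so that U(t) ≤ εt for 0 < t ≤ δ; then for 1/n < δ

\[ \frac{1}{2\pi n} \int_0^{\delta} u(t)\, \frac{\sin^2\frac{nt}{2}}{\sin^2\frac{t}{2}}\, dt = \frac{1}{2\pi n} \int_0^{1/n} + \frac{1}{2\pi n} \int_{1/n}^{\delta} = I_1 + I_2, \]

\[ I_1 \le \frac{n}{2\pi} \int_0^{1/n} u(t)\, dt < \varepsilon \quad \text{(Lebesgue point)}, \]

\[ I_2 \le \frac{\pi}{2n} \int_{1/n}^{\delta} \frac{u(t)}{t^2}\, dt = \frac{\pi}{2n} \left\{ \left[ \frac{U(t)}{t^2} \right]_{1/n}^{\delta} + 2 \int_{1/n}^{\delta} \frac{U(t)}{t^3}\, dt \right\} \le \frac{\pi}{2n} \left( \frac{\varepsilon}{\delta} + 2\varepsilon n \right) \le \frac{3\pi}{2}\, \varepsilon. \]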


7. Show that condition (3) (p. 367) can be replaced by the result of
lemma 3 (p. 368) in the definition of homogeneous function space.


8. Show that every function of bounded variation, continuous relative
to displacement in a norm equal to the total variation (see problem 6, p. 298),
is absolutely continuous.


9. Prove the theorem: if a subset A of a homogeneous space R of functions
φ(x), −π ≤ x ≤ π, is uniformly bounded, i.e. ||φ|| ≤ C, and equicontinuous,
i.e. given any ε > 0, we can find δ > 0 so that

\[ \|\varphi(x+h) - \varphi(x)\| < \varepsilon \quad \text{for } |h| < \delta, \]

then A is compact in R (cf. Arzelà's theorem, Chapter II, Section 7, problem 5)
(S. B. Stechkin).


Hint. For sufficiently large n the σ_n(x) form a compact ε-net for A relative
to R (Chapter II, Section 7).


2. THE FOURIER TRANSFORM


1. When we wish to exhibit a periodic function φ(x) of period
2π in the form of a superposition of pure harmonic waves, we have
recourse to the Fourier series

\[ \varphi(x) = \sum_{n=-\infty}^{\infty} a_n e^{inx}. \tag{1} \]


If we are dealing with a function of period 2πl, the corresponding
Fourier series acquires the form

\[ \varphi(x) = \sum_{n=-\infty}^{\infty} a_n e^{i n x / l}, \tag{2} \]

where the coefficients a_n are defined by the formula

\[ a_n = \frac{1}{2\pi l} \int_{-\pi l}^{\pi l} \varphi(\xi)\, e^{-i n \xi / l}\, d\xi. \tag{3} \]

Formula (3) is obtained by multiplying (2) by e^{−inx/l} and integrating
with respect to x between the limits −πl and πl.
It follows from (2) and (3) that

\[ \varphi(x) = \sum_{n=-\infty}^{\infty} \frac{1}{2\pi l} \int_{-\pi l}^{\pi l} \varphi(\xi)\, e^{i n (x - \xi)/l}\, d\xi. \tag{4} \]
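It is natural to try to effect the limiting passage l → ∞ in (4)
with the object of representing a quite arbitrary function φ(x)
defined on the whole axis −∞ < x < ∞ as a superposition of
harmonic waves. The sum in (4) is an integral sum with step 1/l
in the argument n/l, and the formal passage to the limit l → ∞
leads to the formula

\[ \varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\sigma \left\{ \int_{-\infty}^{\infty} \varphi(\xi)\, e^{i\sigma(x-\xi)}\, d\xi \right\}, \tag{5} \]

where the symbol σ denotes the continuous argument derived from
the discrete argument σ_n = n/l. Thus the required formula for the
development of φ(x) in harmonic waves must be of the form

\[ \varphi(x) = \int_{-\infty}^{\infty} \psi(\sigma)\, e^{i\sigma x}\, d\sigma, \tag{6} \]

where

\[ \psi(\sigma) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi(\xi)\, e^{-i\sigma\xi}\, d\xi. \tag{7} \]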


The function ψ(σ) defined by the formula (7) is said to be the
Fourier transform (or Fourier integral) of the function φ(x);
formula (6) is said to be the inversion formula of the Fourier transform
or the inverse Fourier transform. The inverse Fourier
transform (6) differs from the direct transform (7) only in the sign
of the exponent and the coefficient 1/2π. Sometimes the Fourier
transform is written in the form

\[ \psi(\sigma) = \int_{-\infty}^{\infty} \varphi(\xi)\, e^{-i\sigma\xi}\, d\xi; \tag{8} \]

when the inversion formula assumes the form

\[ \varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \psi(\sigma)\, e^{i\sigma x}\, d\sigma. \tag{9} \]

To confer the maximum symmetry on the direct and inverse transforms,
it is a common practice to define the direct transform by
the formula

\[ \psi(\sigma) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \varphi(\xi)\, e^{-i\sigma\xi}\, d\xi; \tag{10} \]

when the inversion formula assumes the form

\[ \varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \psi(\sigma)\, e^{i\sigma x}\, d\sigma. \tag{11} \]


Whatever the notation employed, it is evident that in every case
the Fourier transform is a linear transformation: it carries the
sum of the functions φ₁(x), φ₂(x) into the sum of ψ₁(σ), ψ₂(σ) and
the product of the function φ(x) by a number λ into the product of
ψ(σ) by the same number λ.

We shall adhere to the definition (7) and the inversion formula (6). 

2. Instead of proving the validity of the limiting passage to
formula (5), we shall show immediately that (6) follows from (7)
under certain hypotheses concerning the function φ(x).

The first hypothesis is, naturally, that φ(x) is integrable over the
whole axis −∞ < x < ∞. This ensures the existence of the
integral (7) for any value of σ, −∞ < σ < ∞.

The first consequence of this hypothesis is that the function
ψ(σ) is bounded, is continuous for all σ, and tends to 0 as |σ| → ∞.
The first assertion derives from the inequality

$$|\psi(\sigma)| \le \int_{-\infty}^{\infty} |\varphi(\xi)|\, d\xi.$$

It also follows from this inequality that a sequence of functions
φ_n(x) which converges in the metric of the space L₁(−∞, ∞)
carries over under the Fourier transform to a sequence of functions
ψ_n(σ) which converges uniformly over the axis −∞ < σ < ∞.
We verify the second and third assertions first of all for the
characteristic function of an interval (c, d). In this case

$$\psi(\sigma) = \int_{c}^{d} e^{-i\sigma\xi}\, d\xi = \frac{e^{-i\sigma c} - e^{-i\sigma d}}{i\sigma},$$

and the expression obtained shows that ψ(σ) is continuous and
tends to zero as |σ| → ∞. Since any step function h(x) is a linear
combination of characteristic functions of intervals, the second
and third assertions hold for all step functions. Finally, any
summable function φ(x) is a limit (with respect to the metric of
L₁(−∞, ∞)) of step functions. By what has been proved, its
Fourier transform ψ(σ) is a limit (in the sense of uniform
convergence over the σ-axis) of continuous functions which vanish
at infinity. But then ψ(σ) itself is a continuous function and
vanishes at infinity, as required.

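We now revert to the proof of formula (6). We begin by
considering the finite integral

$$\varphi_N(x) = \frac{1}{2\pi} \int_{-N}^{N} \left\{ \int_{-\infty}^{\infty} \varphi(\xi)\, e^{i\sigma(x-\xi)}\, d\xi \right\} d\sigma.$$

The interior integral converges uniformly in the parameter σ,
hence we can invert the order of integration:

$$\begin{aligned}
\varphi_N(x) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi(\xi) \left\{ \int_{-N}^{N} e^{i\sigma(x-\xi)}\, d\sigma \right\} d\xi \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \varphi(\xi)\, \frac{e^{iN(x-\xi)} - e^{-iN(x-\xi)}}{i(x-\xi)}\, d\xi \\
&= \frac{1}{\pi} \int_{-\infty}^{\infty} \varphi(\xi)\, \frac{\sin N(x-\xi)}{x-\xi}\, d\xi
 = \frac{1}{\pi} \int_{-\infty}^{\infty} \varphi(x+t)\, \frac{\sin Nt}{t}\, dt.
\end{aligned} \tag{1}$$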




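The last transformation is effected by means of the substitution
x − ξ = −t. We shall show that if φ(x) satisfies Dini's condition

$$\int_{-\delta}^{\delta} \left| \frac{\varphi(x+t) - \varphi(x)}{t} \right| dt < \infty \quad \text{for some } \delta > 0,$$

then as N → ∞ the function φ_N(x) tends to φ(x). For the proof we
recall that†

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin Nt}{t}\, dt = 1.$$

Hence the difference φ_N(x) − φ(x) can be expressed in the form

$$\varphi_N(x) - \varphi(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} \left[ \varphi(x+t) - \varphi(x) \right] \frac{\sin Nt}{t}\, dt.$$

We divide the integral into two parts:

$$\int_{-\infty}^{\infty} = \int_{|t| \le T} + \int_{|t| \ge T}.$$

The second term can be written in the form

$$\frac{1}{\pi} \int_{|t| \ge T} \frac{\varphi(x+t)}{t}\, \sin Nt\, dt - \frac{\varphi(x)}{\pi} \int_{|t| \ge T} \frac{\sin Nt}{t}\, dt,$$

and it is clear that for a given x and sufficiently large T it becomes
arbitrarily small independently of the value of N, N ≥ 1 say.
For the first term we have

$$\frac{1}{\pi} \int_{-T}^{T} \frac{\varphi(x+t) - \varphi(x)}{t}\, \sin Nt\, dt,$$

and since the function [φ(x + t) − φ(x)]/t is summable over the
interval specified (Dini's condition!), it tends to zero as N
increases by lemma 2 of Section 1. Hence

$$\lim_{N \to \infty} \varphi_N(x) = \varphi(x),$$

as required.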


† Cf., for example, A. Ya. Khinchin, A Course in Mathematical Analysis,
Chapter 26, Section III, Gordon and Breach, New York, 1961.




Thus if the function φ(x) is summable and satisfies Dini's
condition, it is given in terms of its Fourier transform ψ(σ) by
formula (6).

We emphasise that in general the integral (6) is not absolutely
convergent and cannot be defined by the formula

$$\lim_{N_1, N_2 \to \infty} \int_{-N_1}^{N_2},$$

where N₁, N₂ tend independently to infinity; it is to be understood
as the symmetric limit of the integrals over (−N, N) used above.

3. We give a few examples on the calculation of the Fourier
transform.

(1) To find the Fourier transform of the function

$$\varphi(x) = \frac{1}{(x-\lambda)^m}, \tag{1}$$

where m is a natural number and λ a non-real constant; let, say,
Im λ > 0.

The integral

$$\psi(\sigma) = \int_{-\infty}^{\infty} \frac{e^{-i\sigma x}}{(x-\lambda)^m}\, dx \tag{2}$$

is absolutely convergent for m > 1, but for m = 1 converges
conditionally in the sense

$$\lim_{N \to \infty} \int_{-N}^{N}.$$


For any m ≥ 1 it is conveniently calculated by the method of
contour integration. For σ > 0 we consider a contour in the plane
z = x + iy formed by the segment −N ≤ x ≤ N of the x-axis
and the semi-circle in the lower half-plane on this segment as
diameter (Fig. 16). The function e^{−iσz} = e^{−iσx} e^{σy} is bounded
in the lower half-plane for σ > 0, and the integral round the
semi-circle tends to zero in virtue of the well-known Jordan's lemma†.
Since the singularity of the integrand is located in the upper
half-plane, for σ > 0 we get ψ(σ) = 0.

[Fig. 16: the contour formed by the segment −N ≤ x ≤ N and the
semi-circle in the lower half-plane.]

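To use Jordan's lemma when σ < 0 we have to take the
semi-circle in the upper half-plane, and by the residue theorem we get

$$\psi(\sigma) = 2\pi i \operatorname*{Res}_{z=\lambda} \frac{e^{-i\sigma z}}{(z-\lambda)^m}.$$

The given residue is easily evaluated if we develop the function
e^{−iσz} in a Taylor series in powers of z − λ:

$$e^{-i\sigma z} = e^{-i\sigma(z-\lambda)}\, e^{-i\sigma\lambda} = e^{-i\sigma\lambda} \sum_{n=0}^{\infty} \frac{\left[-i\sigma(z-\lambda)\right]^n}{n!}.$$

The residue is given by the coefficient of (z − λ)^{−1}; hence

$$\psi(\sigma) = 2\pi i\, e^{-i\sigma\lambda}\, \frac{(-i\sigma)^{m-1}}{(m-1)!}.$$

Thus for Im λ > 0

$$\psi(\sigma) = \begin{cases} 0 & \text{for } \sigma > 0, \\[4pt] 2\pi i\, e^{-i\sigma\lambda}\, \dfrac{(-i\sigma)^{m-1}}{(m-1)!} & \text{for } \sigma < 0. \end{cases} \tag{3}$$

Similarly for Im λ < 0 we find

$$\psi(\sigma) = \begin{cases} -2\pi i\, e^{-i\sigma\lambda}\, \dfrac{(-i\sigma)^{m-1}}{(m-1)!} & \text{for } \sigma > 0, \\[4pt] 0 & \text{for } \sigma < 0. \end{cases} \tag{4}$$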


Any rational function that has no singularities on the real axis
and vanishes at infinity can be developed in simple fractions of
the form A/(x − λ)^m, where Im λ ≠ 0. Hence the formulae
obtained enable us to write down the Fourier transform of any such
rational function. It is easily seen that the functions ψ(σ) given
by (3) and (4) decrease exponentially as |σ| → ∞; hence the
Fourier transform of any such rational function decreases
exponentially as |σ| → ∞.
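
The closed forms (3) and (4) can be checked numerically. The following is
only a rough sketch, not part of the original text: the helper names
`transform_numeric` and `transform_closed_form`, the truncation of the
integral to (−200, 200) and the midpoint rule are all assumptions made
for illustration, with the convention ψ(σ) = ∫ φ(x) e^{−iσx} dx used in
the example above.

```python
import cmath
import math

def transform_numeric(m, lam, sigma, half_width=200.0, steps=80000):
    """Midpoint-rule approximation of the integral of
    e^{-i sigma x} / (x - lam)^m over (-half_width, half_width)."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += cmath.exp(-1j * sigma * x) / (x - lam) ** m
    return total * h

def transform_closed_form(m, lam, sigma):
    """Formulae (3) and (4): the transform vanishes on one half-line,
    depending on the sign of Im(lam)."""
    value = (2j * math.pi * cmath.exp(-1j * sigma * lam)
             * (-1j * sigma) ** (m - 1) / math.factorial(m - 1))
    if lam.imag > 0:
        return value if sigma < 0 else 0j      # formula (3)
    return -value if sigma > 0 else 0j         # formula (4)

# Example: m = 2 and lam = i (illustrative values only).
for sigma in (-1.0, 1.0):
    print(sigma, transform_numeric(2, 1j, sigma), transform_closed_form(2, 1j, sigma))
```

Because the integrand decays only like 1/x², the truncated integral
agrees with the closed form to a few decimal places rather than to
machine precision.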


† V. I. Smirnov, Course of Higher Mathematics, 1951, Vol. III, part 2,
p. 232 (Pergamon, London, 1964).




(2) To find the Fourier transform of the function

$$\varphi(x) = e^{-ax^2} \quad (a > 0).$$

The expression

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-ax^2}\, e^{-i\sigma x}\, dx$$

is the integral of the analytic function e^{−az² − iσz}, z = x + iy, over
the real axis. Since

$$\left| e^{-a(x+iy)^2 - i\sigma(x+iy)} \right| = e^{-ax^2 + ay^2 + \sigma y},$$

the integrand tends to zero uniformly in y as x → ±∞ in any
horizontal strip |y| ≤ y₀. Hence by Cauchy's theorem we can
integrate over any parallel line in the z-plane without altering
the result:

$$\begin{aligned}
\psi(\sigma) &= \int_{-\infty}^{\infty} e^{-a(x+iy)^2}\, e^{-i\sigma(x+iy)}\, dx \\
&= \int_{-\infty}^{\infty} e^{-ax^2 + ay^2 + \sigma y - 2aixy - i\sigma x}\, dx
 = e^{ay^2 + \sigma y} \int_{-\infty}^{\infty} e^{-ax^2 - ix(2ay + \sigma)}\, dx.
\end{aligned}$$

We put y = −σ/2a; then we have ay² + σy = −σ²/4a and by
a well-known formula†

$$\psi(\sigma) = e^{-\sigma^2/4a} \int_{-\infty}^{\infty} e^{-ax^2}\, dx = \sqrt{\frac{\pi}{a}}\; e^{-\sigma^2/4a}.$$

In particular, for φ(x) = e^{−x²/2} (a = 1/2) we get ψ(σ) = √(2π) e^{−σ²/2},
a function of the same form, differing from the original function
only by the factor √(2π).
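
The result of example (2) lends itself to a quick numerical check. The
sketch below is illustrative only: the function names, the truncation of
the integral to (−12, 12) and the midpoint rule are assumptions, not part
of the text.

```python
import cmath
import math

def fourier_transform(phi, sigma, half_width=12.0, steps=24000):
    """Midpoint-rule approximation of psi(sigma) = integral of
    phi(x) e^{-i sigma x} dx over (-half_width, half_width)."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += phi(x) * cmath.exp(-1j * sigma * x)
    return total * h

def gaussian_transform_exact(a, sigma):
    """Closed form sqrt(pi/a) * exp(-sigma^2 / (4a)) from example (2)."""
    return math.sqrt(math.pi / a) * math.exp(-sigma * sigma / (4.0 * a))

a = 0.5  # the case phi(x) = exp(-x^2 / 2)
for sigma in (0.0, 1.0, 2.5):
    approx = fourier_transform(lambda x: math.exp(-a * x * x), sigma)
    print(sigma, approx.real, gaussian_transform_exact(a, sigma))
```

For a = 1/2 the two printed columns should agree closely, and the value
at σ = 0 is √(2π), the factor mentioned above.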


Problem. Complete the empty places in the table

  No.   φ(x)               ψ(σ)

  1     1/(x² + a²)
  2                        1/[τ − i(σ + σ₀)]
  3                        (sin aσ)/σ
  4                        (sin² aσ)/σ
  5     (sin² ax)/x²


† Cf., for example, V. I. Smirnov, Course of Higher Mathematics, Vol. II,
Chapter III, Section 8, art. 78.




Answer:

ψ₁(σ) = (π/a) e^{−a|σ|}; φ₂(x) = e^{x(τ − iσ₀)} for x < 0 and 0 for x > 0;
φ₃(x) = 1/2 for |x| < a and 0 for |x| > a; φ₄(x) = i/4 for 0 < x < 2a,
−i/4 for −2a < x < 0, 0 for |x| > 2a; ψ₅(σ) = π(a − |σ|/2) for |σ| < 2a
and 0 for |σ| > 2a.


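4. We now consider the question of the arithmetic means of
Fourier integrals in the same way as we considered them in Section 1
in relation to Fourier series. In place of the arithmetic mean of n
partial sums of the Fourier series, we naturally consider the
integral mean

$$\sigma_N(x) = \frac{1}{N} \int_{0}^{N} \varphi_\nu(x)\, d\nu. \tag{1}$$

Substituting the value of φ_ν(x) given by formula (1) of art. 2,
we find:

$$\begin{aligned}
\sigma_N(x) &= \frac{1}{N} \int_{0}^{N} \left\{ \frac{1}{\pi} \int_{-\infty}^{\infty} \varphi(x+t)\, \frac{\sin \nu t}{t}\, dt \right\} d\nu \\
&= \frac{1}{N\pi} \int_{-\infty}^{\infty} \frac{\varphi(x+t)}{t} \left\{ \int_{0}^{N} \sin \nu t\, d\nu \right\} dt \\
&= \frac{1}{N\pi} \int_{-\infty}^{\infty} \varphi(x+t)\, \frac{1 - \cos Nt}{t^2}\, dt
 = \frac{2}{N\pi} \int_{-\infty}^{\infty} \varphi(x+t)\, \frac{\sin^2 \frac{N}{2} t}{t^2}\, dt.
\end{aligned}$$

The expression

$$F_N(t) = \frac{2}{\pi N}\, \frac{\sin^2 \frac{N}{2} t}{t^2}$$

is called Fejér's kernel for the Fourier integral. It possesses the
following properties:

(a) $F_N(t) \ge 0$;

(b) $\int_{-\infty}^{\infty} F_N(t)\, dt = 1$;

(c) $\int_{|t| \ge \delta} F_N(t)\, dt \to 0$ as N → ∞ for any fixed δ > 0.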


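Inequality (a) is obvious; equation (b) is deduced from the equation

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin \nu t}{t}\, dt = 1 \tag{2}$$

by integrating with respect to the parameter ν from ε to N and
then letting ε → 0†.
Relation (c) follows from the inequality

$$\int_{|t| \ge \delta} F_N(t)\, dt \le \frac{4}{\pi N} \int_{\delta}^{\infty} \frac{dt}{t^2} = \frac{4}{\pi N \delta}.$$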
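
Properties (b) and (c) of the kernel can be illustrated numerically. The
following is a minimal sketch under stated assumptions: the kernel is
taken as F_N(t) = 2 sin²(Nt/2)/(πN t²), the infinite range is truncated
at |t| = 400, and the helper names are hypothetical.

```python
import math

def fejer_kernel(N, t):
    """F_N(t) = 2 sin^2(N t / 2) / (pi N t^2), with the limiting value at t = 0."""
    if t == 0.0:
        return N / (2.0 * math.pi)
    return 2.0 * math.sin(N * t / 2.0) ** 2 / (math.pi * N * t * t)

def integrate(f, lo, hi, steps):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

def total_mass(N, cutoff=400.0, steps=80000):
    """Approximates property (b); the neglected tail is at most 4 / (pi N cutoff)."""
    return integrate(lambda t: fejer_kernel(N, t), -cutoff, cutoff, steps)

def tail_mass(N, delta, cutoff=400.0, steps=40000):
    """Approximates the integral of F_N over delta <= |t| <= cutoff (property (c))."""
    return 2.0 * integrate(lambda t: fejer_kernel(N, t), delta, cutoff, steps)

print(total_mass(5))
for N in (5, 50):
    print(N, tail_mass(N, 0.5), 4.0 / (math.pi * N * 0.5))
```

The tail mass stays below the bound 4/(πNδ) and shrinks as N grows,
which is all that property (c) asserts.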


Equation (b) implies the relation

$$\sigma_N(x) - \varphi(x) = \int_{-\infty}^{\infty} \left[ \varphi(x+t) - \varphi(x) \right] F_N(t)\, dt. \tag{3}$$



We shall further consider the convergence of the arithmetic
means of the Fourier integral in different normed spaces.
Modifying somewhat the definitions of art. 1, which relate to functions
on the interval [−π, π], we shall call a normed space R of
functions φ(x), −∞ < x < ∞, homogeneous if it satisfies the following
conditions:

(1) all functions φ(x) ∈ R are summable over (−∞, ∞) and
the convergence φ_n → φ in R implies the convergence φ_n → φ in
the norm of L₁(−∞, ∞);

(2) all translations φ(x + h) are contained in R together with
φ(x) and

‖φ(x + h)‖ = ‖φ(x)‖ for any real h;

(3) the norm in R is continuous under translation, i.e.

$$\lim_{h \to 0} \| \varphi(x+h) - \varphi(x) \| = 0.$$
† The validity of integrating the integrand with respect to the parameter
over the interval ε ≤ ν ≤ N is ensured by the uniform convergence in ν of
the integral (2) over the region. The convergence is not uniform for 0 ≤ ν ≤ N.




We prove the following theorem:

THEOREM: If a function φ(x) belongs to a homogeneous space R,
then the arithmetic means σ_N(x) of its Fourier integral also belong
to R and lim σ_N(x) = φ(x) as N → ∞ in the norm of R.



The proof of this theorem, just like that of the analogous theo- 
rem in Section 1, will be based on the integration of abstract 
functions with values in the space R. We have also to consider 
improper integrals of abstract functions; we give the relevant 
definitions. 

Improper integrals of abstract functions. Let us suppose that
an abstract function f(t) with values in the normed space R is
defined and continuous on the half-line (a, ∞). We define the
improper integral

$$\int_{a}^{\infty} f(t)\, dt \tag{4}$$

as the limit of the proper integral

$$\int_{a}^{b} f(t)\, dt$$

as b → ∞, provided the limit exists.
In particular, if the ordinary improper integral

$$\int_{a}^{\infty} \| f(t) \|\, dt \tag{5}$$

is finite, the improper integral (4) exists. For in this case, for any
b′ < b″,

$$\left\| \int_{b'}^{b''} f(t)\, dt \right\| \le \int_{b'}^{b''} \| f(t) \|\, dt \to 0 \quad \text{as } b' \to \infty,\ b'' \to \infty,$$

so that any sequence of proper integrals

$$\int_{a}^{b_n} f(t)\, dt \quad (b_n \to \infty)$$

satisfies Cauchy's criterion; the limit (4) exists since R is complete.
If the integral (5) exists, the integral (4) is said to be absolutely
convergent.




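The improper integrals $\int_{-\infty}^{a}$, $\int_{-\infty}^{\infty}$ are similarly defined.

Letting b → ∞ in the inequality

$$\left\| \int_{a}^{b} f(t)\, dt \right\| \le \int_{a}^{b} \| f(t) \|\, dt$$

we get

$$\left\| \int_{a}^{\infty} f(t)\, dt \right\| \le \int_{a}^{\infty} \| f(t) \|\, dt. \tag{6}$$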


The analogous results hold for absolutely convergent integrals 
of the other two types. 

We now proceed to the proof of the theorem. 

The integral (3) can be regarded as an improper integral of the
abstract function f(t) = [φ(x + t) − φ(x)] F_N(t), which has a value
in R for each value of t. By hypothesis the function f(t) is
continuous. The integral (3) converges absolutely in virtue of the
inequality

$$\| [\varphi(x+t) - \varphi(x)]\, F_N(t) \| \le 2 \| \varphi(x) \|\, F_N(t) \in L_1(-\infty, \infty).$$

In particular, we get σ_N(x) − φ(x) ∈ R; consequently σ_N(x) ∈ R.

Further, for a given ε > 0 there exists δ > 0 such that
‖φ(x + t) − φ(x)‖ < ε/2 whenever |t| ≤ δ. Using properties
(a)-(c) of Fejér's kernel, we find

$$\begin{aligned}
\| \sigma_N(x) - \varphi(x) \| &\le \int_{|t| \le \delta} \| \varphi(x+t) - \varphi(x) \|\, F_N(t)\, dt + \int_{|t| \ge \delta} \| \varphi(x+t) - \varphi(x) \|\, F_N(t)\, dt \\
&\le \max_{|t| \le \delta} \| \varphi(x+t) - \varphi(x) \| \int_{-\infty}^{\infty} F_N(t)\, dt + 2 \| \varphi(x) \| \int_{|t| \ge \delta} F_N(t)\, dt.
\end{aligned}$$

The first term does not exceed ε/2 for any N, while the second
becomes < ε/2 for sufficiently large N > N₀. Hence for N > N₀
we get

$$\| \sigma_N(x) - \varphi(x) \| < \varepsilon,$$

which completes the proof.


We shall apply the theorem to the space R = L₁(−∞, ∞).
Of course we have to verify that L₁(−∞, ∞) is a homogeneous
space. Conditions 1 and 2 are satisfied here in an obvious way.
To verify condition 3, we remark the following:

(a) every characteristic function of an interval is continuous under
displacement in the metric of L₁ (this is easily verified directly);

(b) the set Ω ⊂ L₁ of all functions continuous under displacement
is a closed subspace in L₁ (the proof is similar to that of lemma 3,
Section 1).

If we combine (a) and (b) and recall that linear combinations of
characteristic functions (i.e. step functions) are everywhere dense
in the space L₁, we find that Ω = L₁. In other words, every function
φ(x) ∈ L₁ is continuous under displacement, as required.

All in all, we have obtained the theorem:

The arithmetic means of the Fourier integral of any summable
function φ(x) converge to φ(x) on the line −∞ < x < ∞ in the
metric of L₁(−∞, ∞).

As a corollary we get the following uniqueness theorem for the
Fourier transform.

If the Fourier transform ψ(σ) of a summable function φ(x) vanishes
for all σ, then φ(x) = 0 (almost everywhere).

For then ψ(σ) ≡ 0, φ_ν(x) ≡ 0, σ_N(x) ≡ 0, and therefore φ(x)
= lim σ_N(x) = 0.

As a further example we consider the space CL₁(−∞, ∞) of
all uniformly continuous functions φ(x) summable over (−∞, ∞)
with the norm

$$\| \varphi(x) \| = \max_{-\infty < x < \infty} |\varphi(x)| + \int_{-\infty}^{\infty} |\varphi(x)|\, dx.$$

We leave it to the reader to verify that conditions 1-3 for a
homogeneous space are satisfied. Applying the theorem, the arithmetic
means of the Fourier integral of any uniformly continuous summable
function φ(x), −∞ < x < ∞, converge to φ(x) uniformly over the
whole axis in the metric of CL₁(−∞, ∞). This is the generalisation
of Fejér's theorem to the case of the Fourier integral.


3. THE FOURIER TRANSFORM (CONTINUED)


In this paragraph and subsequently we shall denote the Fourier
operator by the symbol F:

$$F[\varphi(x)] = \int_{-\infty}^{\infty} \varphi(x)\, e^{-i\sigma x}\, dx.$$

The operator F is, as we know, a linear operator with the inverse

$$F^{-1}[\psi(\sigma)] = \frac{1}{2\pi} \int_{-\infty}^{\infty} \psi(\sigma)\, e^{i\sigma x}\, d\sigma.$$



1. The Fourier Transform and the Operation of Differentiation 


Let us suppose that an absolutely integrable function φ(x) is
absolutely continuous in the neighbourhood of any point and that
its derivative is also integrable over the line −∞ < x < ∞. We
shall explain how the Fourier transforms of φ(x) and its derivative
are related. We observe that in virtue of the hypothesis that φ′(x)
is integrable the function

$$\varphi(x) = \varphi(0) + \int_{0}^{x} \varphi'(\xi)\, d\xi$$

has a limit as x → ∞; this limit can only be zero since otherwise
φ(x) would not be integrable. The same thing applies to the case
x → −∞. Integrating by parts, we get:

$$F[\varphi'] = \int_{-\infty}^{\infty} \varphi'(x)\, e^{-i\sigma x}\, dx = \varphi(x)\, e^{-i\sigma x} \Big|_{-\infty}^{\infty} + i\sigma \int_{-\infty}^{\infty} \varphi(x)\, e^{-i\sigma x}\, dx.$$

Our observation shows that the first term vanishes and we have

$$F[\varphi'] = i\sigma F[\varphi].$$

In other words differentiation of the function φ(x) corresponds to
multiplication of the function ψ(σ) = F[φ] by iσ. If the
derivatives of φ(x) up to order m are integrable, repetition of the
process yields the results

$$F[\varphi^{(k)}(x)] = (i\sigma)^k F[\varphi] \quad (k = 0, 1, 2, \ldots, m).$$


Since F[φ^{(k)}(x)] is the Fourier transform of an integrable
function, and hence is bounded as a function of σ (and even tends to
zero as |σ| → ∞), we have the bound for F[φ(x)]:

$$|F[\varphi]| = \frac{|F[\varphi^{(k)}(x)]|}{|\sigma|^k} \le \frac{C}{|\sigma|^k}. \tag{1}$$

Thus, the more integrable derivatives φ(x) has, the faster its
Fourier transform tends to zero at infinity.
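
The rule F[φ′] = iσ F[φ] can be tested on a concrete function. The sketch
below is only an illustration: the test function φ(x) = e^{−x²}, the
truncation to (−12, 12) and the helper names are assumptions, not taken
from the text.

```python
import cmath
import math

def fourier_transform(phi, sigma, half_width=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of phi(x) e^{-i sigma x} dx."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += phi(x) * cmath.exp(-1j * sigma * x)
    return total * h

def phi(x):
    """Test function phi(x) = exp(-x^2) (an arbitrary smooth, integrable choice)."""
    return math.exp(-x * x)

def dphi(x):
    """Its derivative phi'(x) = -2 x exp(-x^2)."""
    return -2.0 * x * math.exp(-x * x)

def derivative_rule_gap(sigma):
    """|F[phi'] - i sigma F[phi]|, which should be close to zero."""
    return abs(fourier_transform(dphi, sigma) - 1j * sigma * fourier_transform(phi, sigma))

for sigma in (0.5, 1.5, 3.0):
    print(sigma, derivative_rule_gap(sigma))
```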




In particular, if φ(x) is reasonably smooth, its Fourier
transform ψ(σ) will also be an absolutely integrable function. It is
evident from inequality (1) that the existence of φ, φ′ and φ″
(in L₁) is sufficient for this. We can confine ourselves to requiring
the existence of φ, φ′, with the additional condition that they
belong to L₂, and not merely to L₁. In fact, as we shall see in
Section 6, φ ∈ L₂, φ′ ∈ L₂ implies σψ(σ) ∈ L₂, so that

$$|\psi(\sigma)| = \left| \sigma \psi(\sigma) \cdot \frac{1}{\sigma} \right| \le \frac{1}{2} \left[ |\sigma \psi(\sigma)|^2 + \frac{1}{\sigma^2} \right]$$

is an integrable function for |σ| ≥ 1 (while near σ = 0 the function
ψ(σ) is bounded), as required.
For any differential operator P(d/dx) of order ≤ m

$$F\left[ P\left( \frac{d}{dx} \right) \varphi \right] = P(i\sigma)\, F[\varphi].$$


A linear differential equation in φ(x) over the x-axis carries into
an algebraic equation in ψ(σ) over the σ-axis. This suggests new
possibilities for solving differential equations. But since the
application of this method requires that the equations be linear and
have constant coefficients, it is inadequate in general for the
solution of ordinary differential equations (especially since we are
restricted to the class of functions integrable over the whole line).
For partial differential equations, however, it is found to be useful.
We shall give an illustration in art. 3 in the example on thermal
conductivity.


2. The Fourier Transform and the Resultant 


Let ψ₁(σ), ψ₂(σ) be the Fourier transforms of absolutely
integrable functions φ₁(x), φ₂(x); we should like to know what
function has the product ψ₁(σ) ψ₂(σ) as its Fourier transform. We have

$$\begin{aligned}
\psi_1(\sigma)\, \psi_2(\sigma) &= \int_{-\infty}^{\infty} \varphi_1(\xi)\, e^{-i\sigma\xi}\, d\xi \int_{-\infty}^{\infty} \varphi_2(\eta)\, e^{-i\sigma\eta}\, d\eta \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \varphi_1(\xi)\, \varphi_2(\eta)\, e^{-i\sigma(\xi+\eta)}\, d\xi\, d\eta,
\end{aligned}$$

with the double integral absolutely convergent (cf. the note to
Fubini's theorem, Chapter IV, Section 5, art. 2). To obtain a single
exponent we make the substitution η = x − ξ; then we get

$$\begin{aligned}
\psi_1(\sigma)\, \psi_2(\sigma) &= \int_{-\infty}^{\infty} \varphi_1(\xi) \left\{ \int_{-\infty}^{\infty} \varphi_2(x-\xi)\, e^{-i\sigma x}\, dx \right\} d\xi \\
&= \int_{-\infty}^{\infty} e^{-i\sigma x} \left\{ \int_{-\infty}^{\infty} \varphi_1(\xi)\, \varphi_2(x-\xi)\, d\xi \right\} dx,
\end{aligned} \tag{1}$$

the inversion of the order of integration being valid in virtue of
Fubini's theorem. The integral

$$\varphi(x) = \int_{-\infty}^{\infty} \varphi_1(\xi)\, \varphi_2(x-\xi)\, d\xi,$$

which exists (again by one of the assertions of Fubini's theorem)
for almost all x and is absolutely integrable with respect to x
(by another of its assertions), is said to be the convolute or the
resultant of the functions φ₁(x), φ₂(x). Formula (1) shows that the
product of the functions ψ₁(σ), ψ₂(σ) is the Fourier transform of
the resultant of φ₁(x), φ₂(x).

The convolute of φ₁(x), φ₂(x) is denoted by φ₁ * φ₂. This is a
commutative and associative operation since it is carried under
Fourier transformation into the commutative and associative
operation ψ₁ψ₂.
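
The resultant formula can be checked on characteristic functions of
intervals, for which everything is available in closed form. This is a
hedged sketch, not the author's construction: the names `indicator_transform`
and `resultant_of_indicators` are hypothetical, and the trapezoidal shape
of the resultant of the characteristic functions of [0, a] and [0, b] is
an elementary computation assumed here.

```python
import cmath

def indicator_transform(a, sigma):
    """F[e_a](sigma) for the characteristic function of [0, a]:
    (1 - e^{-i sigma a}) / (i sigma), sigma != 0."""
    return (1.0 - cmath.exp(-1j * sigma * a)) / (1j * sigma)

def resultant_of_indicators(a, b, x):
    """(e_a * e_b)(x): the length of the overlap of [0, a] and [x - b, x],
    a trapezoid supported on [0, a + b]."""
    return max(0.0, min(x, a, b, a + b - x))

def fourier_transform(phi, sigma, lo, hi, steps=30000):
    """Midpoint-rule approximation of the integral of phi(x) e^{-i sigma x} dx."""
    h = (hi - lo) / steps
    total = 0j
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += phi(x) * cmath.exp(-1j * sigma * x)
    return total * h

a, b, sigma = 1.0, 2.0, 1.3  # illustrative values
lhs = fourier_transform(lambda x: resultant_of_indicators(a, b, x), sigma, 0.0, a + b)
rhs = indicator_transform(a, sigma) * indicator_transform(b, sigma)
print(lhs, rhs)
```

The two printed numbers should agree, in line with the statement that
the transform of the resultant is the product of the transforms.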

Problems. 1. Let e_a(x) be the characteristic function of the interval
0 ≤ x ≤ a. Find the resultant

$$e_b(x) * \left[ e_{a+h}(x) - e_a(x) \right].$$

Answer. Cf. Fig. 17.

[Fig. 17: graph of the resultant, with the abscissae a, a + h, a + b,
a + b + h marked on the x-axis.]

2. Prove that for any φ(x) ∈ L₁(−∞, ∞)

$$\lim_{h \to 0} \varphi(x) * \frac{e_{a+h}(x) - e_a(x)}{h} = \varphi(x - a) \tag{2}$$

(the limit being in the sense of the metric of L₁(−∞, ∞)).

Hint. Prove (2) for the functions φ(x) = e_b(x) and their linear combinations.
In the passage to the limit use the boundedness of the norm of the second
factor.




3. If A is a closed subspace in L₁(−∞, ∞) which contains together with
any function φ(x) all its displacements φ(x − h), then for any function
ψ(x) ∈ L₁(−∞, ∞) it will contain the resultant of φ(x) with ψ(x).

Hint. The resultant is a limit of linear combinations of displacements.


3. Application of the Fourier Transform to the Solution of the
Thermal Conductivity Equation

We shall find the solution of the thermal conductivity equation

$$\frac{\partial u(x,t)}{\partial t} = \frac{\partial^2 u(x,t)}{\partial x^2} \tag{1}$$

(−∞ < x < ∞, t ≥ 0) which coincides with a given function
u₀(x) at t = 0. The physical significance of the problem specified
consists in determining the temperature of a one-dimensional
homogeneous continuum (an infinite rod) at any instant t > 0
from its known temperature at time t = 0. We shall make the
following assumptions:

(a) the functions u(x, t), u_x(x, t), u_xx(x, t) are integrable with
respect to x for −∞ < x < ∞ and any fixed t ≥ 0;

(b) in every interval 0 ≤ t ≤ T the function u_t(x, t) has an
integrable majorant:

$$|u_t(x,t)| \le \Phi(x), \qquad \int_{-\infty}^{\infty} \Phi(x)\, dx < \infty.$$

We shall pass from equation (1) to Fourier transforms by
multiplying by e^{−iσx} and integrating with respect to x from −∞ to ∞.
Condition (b) allows us to write

$$\int_{-\infty}^{\infty} u_t(x,t)\, e^{-i\sigma x}\, dx = \frac{\partial}{\partial t} \int_{-\infty}^{\infty} u(x,t)\, e^{-i\sigma x}\, dx = v_t(\sigma, t),$$

where

$$v(\sigma, t) = \int_{-\infty}^{\infty} u(x,t)\, e^{-i\sigma x}\, dx$$

is the Fourier transform of the required solution u(x, t).
By condition (a) and the results of art. 1

$$F[u_{xx}(x,t)] = -\sigma^2 F[u] = -\sigma^2 v(\sigma, t).$$

Consequently we obtain the ordinary differential equation

$$v_t(\sigma, t) = -\sigma^2 v(\sigma, t).$$




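We have to find a solution of this equation which reduces at
t = 0 to

$$v_0(\sigma) = F[u_0(x)] = \int_{-\infty}^{\infty} u_0(x)\, e^{-i\sigma x}\, dx.$$

The required solution is evidently of the form

$$v(\sigma, t) = e^{-\sigma^2 t}\, v_0(\sigma).$$

We know (cf. Section 2, ex. 2 with a = 1/4t) that

$$e^{-\sigma^2 t} = F\left[ \frac{1}{2\sqrt{\pi t}}\, e^{-x^2/4t} \right].$$

By the resultant formula (art. 2)

$$v(\sigma, t) = F\left[ \frac{1}{2\sqrt{\pi t}}\, e^{-x^2/4t} \right] F[u_0] = F\left[ \frac{1}{2\sqrt{\pi t}}\, e^{-x^2/4t} * u_0(x) \right],$$

and since v(σ, t) = F[u(x, t)], we have finally

$$u(x,t) = \frac{1}{2\sqrt{\pi t}}\, e^{-x^2/4t} * u_0(x) = \frac{1}{2\sqrt{\pi t}} \int_{-\infty}^{\infty} e^{-\xi^2/4t}\, u_0(x-\xi)\, d\xi.$$

This form of the solution is known as Poisson's integral.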
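
Poisson's integral can be evaluated numerically and compared with a case
where the answer is known in closed form. The sketch below is illustrative
only: the initial temperature u₀(x) = e^{−x²} is an assumed example, for
which the resultant of two Gaussians gives
u(x, t) = e^{−x²/(1+4t)}/√(1+4t); the function names, the truncation to
(−30, 30) and the midpoint rule are likewise assumptions.

```python
import math

def poisson_integral(u0, x, t, half_width=30.0, steps=30000):
    """Midpoint-rule approximation of
    (1 / (2 sqrt(pi t))) * integral of exp(-xi^2 / (4t)) u0(x - xi) d xi."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        xi = -half_width + (k + 0.5) * h
        total += math.exp(-xi * xi / (4.0 * t)) * u0(x - xi)
    return total * h / (2.0 * math.sqrt(math.pi * t))

def u0(x):
    """Assumed initial temperature exp(-x^2)."""
    return math.exp(-x * x)

def exact_solution(x, t):
    """Closed form for this particular u0: exp(-x^2 / (1 + 4t)) / sqrt(1 + 4t)."""
    return math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)

for (x, t) in ((0.0, 0.5), (0.7, 0.5), (0.3, 0.01)):
    print(x, t, poisson_integral(u0, x, t), exact_solution(x, t))
```

As t decreases towards 0 the computed values approach u₀(x), in agreement
with the initial condition.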


4. The Relation Between the Decrease of a Function φ(x) as |x| → ∞
and the Smoothness of its Fourier Transform


We know that the Fourier transform ψ(σ) of an absolutely
integrable function φ(x) is a bounded continuous function of σ,
−∞ < σ < ∞, which tends to zero as |σ| → ∞. Let us suppose
now that not only φ(x) but also xφ(x) is an integrable function
over the line −∞ < x < ∞. Then we can assert that the function
ψ(σ) is differentiable. For formal differentiation of the Fourier
integral

$$\int_{-\infty}^{\infty} \varphi(x)\, e^{-i\sigma x}\, dx = \psi(\sigma)$$

with respect to the parameter σ yields the integral

$$-i \int_{-\infty}^{\infty} x\, \varphi(x)\, e^{-i\sigma x}\, dx,$$

which converges absolutely and uniformly in σ. By a well-known
theorem on the differentiation of a uniformly convergent integral
the function ψ(σ) is differentiable and

$$\psi'(\sigma) = -i \int_{-\infty}^{\infty} x\, \varphi(x)\, e^{-i\sigma x}\, dx.$$

We obtain the significant formula

$$i F'[\varphi] = F[x\varphi], \tag{1}$$

which shows that the operation of multiplication by x carries
under Fourier transformation into the operation i d/dσ. As the
Fourier transform of an integrable function, the function ψ′(σ)
is again continuous and bounded and tends to zero as |σ| → ∞.
If the functions xφ(x), x²φ(x), ..., x^m φ(x) are all integrable
together with φ(x) over the x-axis, the process of differentiation
can be continued; we get that the function ψ(σ) = F[φ] has
derivatives up to order m which are continuous and bounded and
tend to zero as |σ| → ∞, and

$$i^k F^{(k)}[\varphi] = F[x^k \varphi] \quad (k = 0, 1, \ldots, m). \tag{2}$$


For an arbitrary polynomial P(x) of degree ≤ m we obtain the
formula

$$P\left( i \frac{d}{d\sigma} \right) F[\varphi] = F[P(x)\, \varphi]. \tag{3}$$

If all the products x^m φ(x) (m = 0, 1, 2, ...) are integrable, the
function F[φ] = ψ(σ) has derivatives with respect to σ of all
orders, each one being continuous and tending to zero as |σ| → ∞.
We see that the stronger the conditions we impose on the decrease
of φ(x), the smoother is the function ψ(σ) we obtain.

In this context we can specify an important class of functions
which maps into itself under Fourier transformation, of course
with the argument x replaced by σ. Let us consider the set S_x
of infinitely differentiable functions φ(x) which for all k, q = 0,
1, 2, ... satisfy an inequality

$$|x^k \varphi^{(q)}(x)| \le C_{kq}, \tag{4}$$

where C_{kq} is a constant depending on the choice of the function φ.

We denote by S_σ the class of the same functions ψ(σ) of the
argument σ.




We observe first that each of the functions x^k φ^{(q)}(x) is not only
bounded but also integrable over the line, since in addition to (4)
we have the inequality

$$|x^{k+2} \varphi^{(q)}(x)| \le C_{k+2,q}, \qquad |x^k \varphi^{(q)}(x)| \le \frac{C_{k+2,q}}{x^2}.$$

Let ψ(σ) = F[φ] be the Fourier transform of a function φ(x) ∈ S_x.
By what we have proved the function ψ(σ) is infinitely
differentiable and

$$i^q \psi^{(q)}(\sigma) = F[x^q \varphi(x)].$$

Further, the function x^q φ(x) is infinitely differentiable together
with φ(x) and all its successive derivatives are integrable, since
by Leibniz' formula they can be expressed linearly in terms of
the integrable functions x^{q−j} φ^{(k−j)}(x). Hence the functions

$$i^q (i\sigma)^k \psi^{(q)}(\sigma) = F\left[ (x^q \varphi(x))^{(k)} \right],$$

being the Fourier transforms of integrable functions, are bounded
for all k, q. Thus if φ(x) ∈ S_x, then ψ(σ) ∈ S_σ. Conversely let a
function ψ(σ) ∈ S_σ be given. We construct the function

$$\varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \psi(\sigma)\, e^{i\sigma x}\, d\sigma.$$

The function 2πφ(−x) is evidently the Fourier transform of ψ(σ)
and is therefore contained in S_x. But then obviously φ(x) ∈ S_x.
By the inversion formula

$$\psi(\sigma) = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi\, \varphi(-x)\, e^{i\sigma x}\, dx = \int_{-\infty}^{\infty} \varphi(x)\, e^{-i\sigma x}\, dx,$$

so that ψ(σ) is the Fourier transform of φ(x). Thus each function
ψ ∈ S_σ is the Fourier transform of a function φ ∈ S_x. The class S_x
maps under Fourier transformation onto the whole class S_σ. We can
express this fact symbolically by the equation

$$F[S_x] = S_\sigma.$$

Let us see how the smoothness properties of the functions ψ(σ)
can be improved by imposing further restrictions on the behaviour
of the functions φ(x) at infinity.




Let the product φ(x) e^{b|x|} be integrable, where b > 0 is a fixed
constant. We can assert in this case that the Fourier transform
ψ(σ) of φ(x) is not only infinitely differentiable but analytic. For
the Fourier integral

$$\psi(s) = \int_{-\infty}^{\infty} \varphi(x)\, e^{-isx}\, dx$$

is now defined not only for real σ, but also for certain complex s;
if we put s = σ + iτ (σ, τ real), we get

$$\psi(\sigma + i\tau) = \int_{-\infty}^{\infty} \varphi(x)\, e^{-i(\sigma + i\tau)x}\, dx = \int_{-\infty}^{\infty} \varphi(x)\, e^{-i\sigma x}\, e^{\tau x}\, dx,$$

and the integral obtained converges for |τ| ≤ b, i.e. over a
complete horizontal strip of the s-plane. We have obtained a function
of the complex variable s which is analytic at every interior point
of this strip; for, differentiating formally with respect to s, we
get

$$\int_{-\infty}^{\infty} \varphi(x)\, e^{-isx}\, (-ix)\, dx.$$

The integral obtained converges uniformly in some neighbourhood
of the point s (provided it lies within the specified strip) and is
therefore the derivative of the function ψ(s). The function ψ(s)
is bounded over the whole strip, since

$$|\psi(s)| \le \int_{-\infty}^{\infty} |\varphi(x)|\, e^{\tau x}\, dx \le \int_{-\infty}^{\infty} |\varphi(x)|\, e^{b|x|}\, dx.$$

It follows, in particular, that to a sequence of functions φ_n(x)
which converges in the norm $\|\varphi\| = \int_{-\infty}^{\infty} |\varphi(x)|\, e^{b|x|}\, dx$ corresponds
a sequence ψ_n(s) which converges uniformly over the whole strip
|τ| ≤ b.

We can assert further that as σ → ±∞ the function
ψ(s) = ψ(σ + iτ) tends to zero uniformly in τ, |τ| ≤ b. For this
is the case for the Fourier transform of the characteristic function
of an interval (α, β):

$$\psi(s) = \int_{\alpha}^{\beta} e^{-isx}\, dx = \frac{e^{-is\alpha} - e^{-is\beta}}{is},$$




since the numerator of the ratio obtained is bounded for |τ| ≤ b.
The transition to the general case is effected by means of the usual
limiting process with step functions.

We observe that in virtue of the last property the integration
in the inversion formula can be taken, not only over the real axis,
but over any parallel line that lies in the prescribed strip of the
s-plane, so that

$$\varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \psi(\sigma)\, e^{i\sigma x}\, d\sigma = \frac{1}{2\pi} \int_{-\infty}^{\infty} \psi(\sigma + i\tau)\, e^{i(\sigma + i\tau)x}\, d\sigma.$$

Let us make the further assumption that the product of φ(x)
with e^{b|x|} is integrable for any b. Then the function ψ(s) is defined
and analytic on any strip |τ| ≤ b, i.e. it is an entire analytic
function; by what we have proved it is bounded over any strip |τ| ≤ b
(the bound depending on b) and tends uniformly to zero as σ → ±∞.
In the inversion formula we can integrate over any line parallel
to the axis of abscissae.

We can impose a still more severe restriction on the behaviour
of φ(x) at infinity by requiring that its product with the function
e^{|x|^p}, p > 1, be integrable. It can be shown (though we shall not
dwell here on the proof) that in this case the entire function ψ(s)
will satisfy a bound of the form

$$|\psi(\sigma + i\tau)| \le C e^{|\tau|^{p'}}, \quad \text{where } \frac{1}{p} + \frac{1}{p'} = 1.$$

The function ψ(s) is then said to have an exponential order of
growth ≤ p′ in the s-plane.

The numbers p, p′ both exceed 1; but they vary in opposite
directions and when p increases without bound, p′ approaches 1.

Let us suppose finally that the product of φ(x) with any
increasing function of |x| is integrable. This property is possessed
by finite functions (which vanish almost everywhere outside some
interval |x| ≤ a) and, as is easily seen, only by such functions.
So let us put φ(x) equal to zero for |x| ≥ a. Then the Fourier
transform

$$\psi(s) = \int_{-a}^{a} \varphi(x)\, e^{-isx}\, dx$$

is an entire analytic function of s; it admits the following bound in
the s-plane:

$$|\psi(\sigma + i\tau)| \le \int_{-a}^{a} |\varphi(x)|\, e^{|\tau| |x|}\, dx \le C e^{a|\tau|},$$

where $C = \int_{-a}^{a} |\varphi(x)|\, dx$; the function ψ(s) is said to be an entire
function of at most first order growth of type ≤ a. Thus the more
rapidly the function φ(x) decreases at infinity, the "smoother"
its Fourier transform ψ(σ) becomes. Beginning with continuous
functions ψ(σ), we have progressed through finitely differentiable,
infinitely differentiable, analytic-in-a-strip, and
analytic-in-the-plane functions before arriving at first-order analytic
functions of finite type. This is the limit of smoothness for functions
tending to zero on both sides of the real axis (we know that the Fourier
transforms of integrable functions always have this latter property);
it is known that there do not exist entire analytic functions distinct
from zero which are bounded over the real line and have a slower
rate of growth in the plane than e^{a|τ|} for all a > 0†.

5. We give one of the simplest applications of the theorems
proved. Let φ₀(x) ∈ L₂(a, b) (−∞ ≤ a, b ≤ ∞) be a function
which is non-zero almost everywhere and satisfies the inequality

$$|\varphi_0(x)| \le C e^{-\alpha |x|}$$

for some positive α. We shall show that the system of functions
φ_n(x) = x^n φ₀(x) (n = 0, 1, 2, ...) is complete in the space L₂(a, b)
in the sense that the linear combinations of these functions
form a set which is everywhere dense in that space.

For if this were not the case, there would exist, by the
orthogonal complement theorem (Chapter V, Section 2, art. 8), a
function f(x) ∈ L₂(a, b), not identically zero, orthogonal to all the
functions φ_n(x), so that for any n = 0, 1, 2, ...

$$\int_{a}^{b} f(x)\, x^n\, \varphi_0(x)\, dx = 0. \tag{5}$$

The product of the function f(x) φ₀(x) with any function e^{β|x|},
β < α, is integrable; hence continuing f(x) φ₀(x) if necessary as

† Cf., for example, A. I. Markushevich, Theory of Analytic Functions,
State Tech. Pub. Dept. 1950, Chapter 6, Section 3.




identically zero over the remaining part of the line (−∞, ∞), we
get that its Fourier transform g(s) is analytic in the strip |τ| < α.
Since in accordance with formula (2)

$$i^n g^{(n)}(\sigma) = \int_{a}^{b} f(x)\, x^n\, \varphi_0(x)\, e^{-i\sigma x}\, dx,$$

it follows from (5) that g^{(n)}(0) = 0 for all n = 0, 1, 2, ... Hence
g(σ) ≡ 0 everywhere in virtue of the analyticity of g(s). By the
uniqueness theorem proved at the end of Section 2, we have also
f(x) φ₀(x) = 0 (almost everywhere) and consequently f(x) = 0
(almost everywhere), in contradiction to the construction. Thus
the system x^n φ₀(x) is complete in L₂(a, b) as required.

For example, taking a = −∞, b = ∞, φ₀(x) = e^{−x²}, we verify
the completeness of the system of Hermite functions x^n e^{−x²}
(Chapter V, Section 2), and with a = 0, b = ∞, φ₀(x) = e^{−x}, the
completeness of the Laguerre functions (Chapter V, Section 2).


4. THE LAPLACE TRANSFORM


1. Let there be given a function φ(x) (possibly non-integrable
itself) whose product with e^{γx} is integrable for some real γ. Then
the Fourier transform of φ(x), which may not exist in our original
sense, is found to exist for certain complex s:

$$\psi(s) = \int_{-\infty}^{\infty} \varphi(x)\, e^{-isx}\, dx = \int_{-\infty}^{\infty} \varphi(x)\, e^{-i\sigma x}\, e^{\tau x}\, dx,$$

in particular, on the line τ = γ. We see that on this line ψ(s) is the
Fourier transform of the integrable function φ(x) e^{γx}.

The most important instance of this kind occurs under the
conditions

$$|\varphi(x)| \le C e^{\alpha x} \ \text{ for } x > 0, \qquad \varphi(x) = 0 \ \text{ for } x < 0. \tag{1}$$

Hence the Fourier transform

$$\psi(s) = \int_{0}^{\infty} \varphi(x)\, e^{-i\sigma x}\, e^{\tau x}\, dx = \int_{0}^{\infty} \varphi(x)\, e^{-isx}\, dx \tag{2}$$

exists for all τ < −α, i.e. in the half-plane of the s-plane bounded
above by the line τ = −α. As we know already, we can carry
out the integration in the inversion formula over any horizontal
line τ = −γ that lies below the line τ = −α:

$$\varphi(x) = \frac{1}{2\pi} \int_{-i\gamma - \infty}^{-i\gamma + \infty} \psi(s)\, e^{isx}\, ds. \tag{3}$$

We introduce a change of variable ts = in formulae (2) and (8). 
When s varies over the half-plane {ms < —«, p varies over the 
half-plane Re p > «. The function 


= fp(ae-Pdax 
6 


is defined and analytic in the half-plane Re p > «; on each vertical 
line of this half-plane it tends to zero as Im p > + oo, uniformly 
over any finite interval of variation of Re p. The inversion for- 
mula (3) acquires the form 


ytio ytioo 
2 AP _ 7 
9 (%) = > il @D(p) er mee —_ (: D(p) cP? d p(y > a). 
y-ico y-too 


The function @(p) is said to be the Laplace transform of the 
function g(x) [which is subject to the condition (1)]. As we have 
seen, the Laplace transform is distinguished from the Fourier 
transform (considered in the complex plane) only by a rotation 
of 90° in the domain of the complex argument. 
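The relation between the two transforms can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the test function $\varphi(x) = e^{\alpha x}$ for $x > 0$ (so that condition (1) holds with $C = 1$), whose Laplace transform is $1/(p - \alpha)$, and it approximates the integral by a truncated trapezoidal rule. The names `laplace_numeric`, `upper` and `steps` are illustrative choices.

```python
import cmath
import math

def laplace_numeric(phi, p, upper=40.0, steps=40000):
    """Approximate Phi(p) = integral_0^upper phi(x) e^{-p x} dx by the trapezoidal rule."""
    h = upper / steps
    total = 0.5 * (phi(0.0) + phi(upper) * cmath.exp(-p * upper))
    for j in range(1, steps):
        x = j * h
        total += phi(x) * cmath.exp(-p * x)
    return total * h

alpha = 0.5
phi = lambda x: math.exp(alpha * x)   # assumed test function, |phi(x)| <= e^{alpha x}

p = 2.0 + 3.0j                        # a point of the half-plane Re p > alpha
s = p / 1j                            # the corresponding Fourier argument, since i s = p
approx = laplace_numeric(phi, p)
exact = 1.0 / (p - alpha)             # closed form for this particular phi

print("s =", s, " (Im s < -alpha:", s.imag < -alpha, ")")
print("numeric:", approx, " exact:", exact)
```

The point $p = 2 + 3i$ corresponds to $s = 3 - 2i$, which lies below the line $\tau = -\alpha$, in agreement with the rotation by 90° described above.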

The following simple theorem gives general sufficient (but by no means necessary) conditions for a given function $\Phi(p)$ to be the Laplace transform of some function $\varphi(x)$ which satisfies conditions (1).

THEOREM 1. If a function $\Phi(p)$ satisfies the conditions:

(a) $\Phi(p)$ is analytic in the half-plane $\operatorname{Re} p > \gamma_0$;
(b) $\Phi(p)$ admits the bound ($p = p_1 + i p_2$)

$$|\Phi(p_1 + i p_2)| \le B(p_2), \quad \text{where } \int_{-\infty}^{\infty} B(p_2)\, dp_2 = B < \infty,$$

then it is the Laplace transform of a function $\varphi(x)$ which vanishes for $x < 0$ and satisfies an inequality of the form

$$|\varphi(x)| \le C e^{\gamma_0 x}$$

for $x > 0$.




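Proof. We define a function $\varphi(x)$ by means of the formula

$$\varphi(x) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \Phi(p)\, e^{px}\, dp \qquad (\gamma > \gamma_0). \tag{4}$$

Arguing as usual with the aid of Cauchy's formula and using properties (a), (b), it is easily verified that the integral (4) is independent of $\gamma$. At the same time we have the bound

$$|\varphi(x)| \le \frac{1}{2\pi} \int_{-\infty}^{\infty} |\Phi(\gamma + i p_2)|\, e^{\gamma x}\, dp_2 \le \frac{B}{2\pi}\, e^{\gamma x}.$$

For $x > 0$, letting $\gamma$ tend to $\gamma_0$, we obtain the bound

$$|\varphi(x)| \le C e^{\gamma_0 x};$$

for $x < 0$, letting $\gamma$ tend to $+\infty$, we get $\varphi(x) = 0$.

If we express formula (4) in the form

$$\varphi(x) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \Phi(p_1 + i p_2)\, e^{(p_1 + i p_2)x}\, i\, dp_2 = \frac{e^{p_1 x}}{2\pi} \int_{-\infty}^{\infty} \Phi(p_1 + i p_2)\, e^{i p_2 x}\, dp_2,$$

we see that $2\pi \varphi(-x)\, e^{p_1 x}$ is the Fourier transform with respect to the variable $p_2$ of the absolutely integrable function $\Phi(p_1 + i p_2)$ ($p_1$ fixed). By the inversion formula

$$\Phi(p_1 + i p_2) = \frac{1}{2\pi} \int_{-\infty}^{\infty} 2\pi \varphi(-x)\, e^{p_1 x}\, e^{i p_2 x}\, dx = \int_0^{\infty} \varphi(x)\, e^{-px}\, dx,$$

so that $\Phi(p_1 + i p_2)$ is in fact the Laplace transform of the function $\varphi(x)$.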
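As an illustration of Theorem 1, the inversion integral (4) can be evaluated numerically. The sketch below is an assumption-laden example, not taken from the text: it uses $\Phi(p) = 1/(p + 1)^2$, which satisfies conditions (a) and (b) for any $\gamma_0 > -1$ (for instance $\gamma_0 = -1/2$ with $B(p_2) = 1/(1/4 + p_2^2)$) and whose original is known to be $\varphi(x) = x e^{-x}$ for $x > 0$. The integral is truncated to $|p_2| \le P$ and computed by the trapezoidal rule; `bromwich`, `P` and `steps` are illustrative names.

```python
import cmath
import math

def bromwich(Phi, x, gamma, P=1000.0, steps=100000):
    """Approximate (1/(2 pi i)) * integral over Re p = gamma of Phi(p) e^{p x} dp,
    written as (e^{gamma x}/(2 pi)) * integral_{-P}^{P} Phi(gamma + i p2) e^{i p2 x} dp2."""
    h = 2.0 * P / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        p2 = -P + j * h
        w = 0.5 if j in (0, steps) else 1.0      # trapezoidal end-point weights
        total += w * Phi(complex(gamma, p2)) * cmath.exp(1j * p2 * x)
    return (math.exp(gamma * x) * total * h / (2.0 * math.pi)).real

Phi = lambda p: 1.0 / (p + 1.0) ** 2             # assumed example transform
values = {x: bromwich(Phi, x, gamma=0.0) for x in (-1.0, 0.5, 2.0)}

for x, v in values.items():
    expected = x * math.exp(-x) if x > 0 else 0.0
    print(f"x = {x:5.2f}   numeric = {v: .6f}   x e^(-x) or 0 = {expected: .6f}")
```

The computed value for $x < 0$ is close to zero, as the theorem predicts, and the values for $x > 0$ agree with $x e^{-x}$ to within the truncation error of the quadrature.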
2. The Laplace transform is often used to solve both ordinary and partial differential equations which describe regulating processes; in such problems the unknown function $f(t)$ vanishes for $t < 0$, and for $t > 0$ must be a solution of some equation which satisfies certain initial conditions at $t = 0$.

We begin by considering the ordinary differential equation

$$a_0\, y^{(n)}(t) + a_1\, y^{(n-1)}(t) + \cdots + a_n\, y(t) = b(t) \tag{1}$$

with prescribed values

$$y(0) = y_0, \quad y'(0) = y_1, \quad \ldots, \quad y^{(n-1)}(0) = y_{n-1}.$$


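We multiply equation (1) by $e^{-pt}$ and integrate with respect to $t$ from 0 to $\infty$. Denoting the Laplace transform of $y(t)$ by

$$Y(p) = \int_0^{\infty} y(t)\, e^{-pt}\, dt$$

and integrating by parts, we get

$$\begin{aligned}
\int_0^{\infty} y'(t)\, e^{-pt}\, dt &= y(t)\, e^{-pt}\Big|_0^{\infty} + p \int_0^{\infty} y(t)\, e^{-pt}\, dt = -y_0 + p\,Y(p), \\
\int_0^{\infty} y''(t)\, e^{-pt}\, dt &= y'(t)\, e^{-pt}\Big|_0^{\infty} + p \int_0^{\infty} y'(t)\, e^{-pt}\, dt \\
&= -y_1 + p\,(-y_0 + p\,Y(p)) = -y_1 - p\,y_0 + p^2\, Y(p), \\
&\;\;\vdots \\
\int_0^{\infty} y^{(n)}(t)\, e^{-pt}\, dt &= y^{(n-1)}(t)\, e^{-pt}\Big|_0^{\infty} + p \int_0^{\infty} y^{(n-1)}(t)\, e^{-pt}\, dt \\
&= -y_{n-1} + p\,(-y_{n-2} - p\,y_{n-3} - \cdots + p^{n-1}\, Y(p)) \\
&= -y_{n-1} - p\,y_{n-2} - \cdots - p^{n-1} y_0 + p^n\, Y(p).
\end{aligned} \tag{2}$$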


Multiplying each of the equations (2) by the corresponding coefficient $a_j$ and adding, we obtain an equation of the form

$$R_0(p) + R(p)\, Y(p) = B(p),$$

where $R_0(p)$ is a polynomial of not higher than the $(n - 1)$th degree in $p$, $R(p)$ is a polynomial of the $n$th degree in $p$, and $B(p)$ is the Laplace transform of the function $b(t)$. We thus obtain a purely algebraic equation for the unknown function $Y(p)$. Solving it, we find

$$Y(p) = \frac{B(p) - R_0(p)}{R(p)},$$

and the required solution is given by the inversion formula

$$y(t) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{B(p) - R_0(p)}{R(p)}\, e^{pt}\, dp. \tag{3}$$


To evaluate the integral (3), we resort as a rule to contour integration and the theory of residues, as we did when evaluating the Fourier integral of rational functions. We observe that the function $e^{pt}$ is bounded in the left half-plane ($\operatorname{Re} p < \gamma$) for $t > 0$ but not in the right; hence any semicircles which form part of the contour must be taken to the left of the line $\operatorname{Re} p = \gamma$ and not to the right. We can take as $\gamma$ any number that satisfies the requirement that all the singularities of the integrand lie to the left of the line $\operatorname{Re} p = \gamma$.
Example. Let us consider the second-order equation

$$a_0\, y'' + a_1\, y' + a_2\, y = b \sin kt, \qquad y_0 = 0, \quad y_1 = 0,$$

with complex conjugate (non-real) characteristic roots $\lambda = \alpha + i\beta$, $\bar\lambda = \alpha - i\beta$, where $\alpha < 0$.

In electrical engineering an equation of this form describes forced oscillations in a resistance-inductance-capacity circuit under the action of a constraining e.m.f. of frequency $k$. Under Laplace transformation it gives rise to the equation

$$(a_0 p^2 + a_1 p + a_2)\, Y(p) = \int_0^{\infty} b \sin kt\; e^{-pt}\, dt = \frac{bk}{k^2 + p^2}.$$

Solving this equation, we find

$$Y(p) = \frac{bk}{(a_0 p^2 + a_1 p + a_2)(k^2 + p^2)}.$$

By the inversion formula

$$y(t) = \frac{bk}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{pt}\, dp}{(a_0 p^2 + a_1 p + a_2)(k^2 + p^2)}.$$

We put

$$f(p) = \frac{e^{pt}}{(a_0 p^2 + a_1 p + a_2)(k^2 + p^2)}.$$

The function $f(p)$ has four simple poles in the unextended plane at the points $\pm ik$, $\alpha \pm i\beta$. As $\gamma$ we can take any positive number. To evaluate the integral we append to the line $\operatorname{Re} p = \gamma$ an infinite semicircle in the left half-plane; then by the residue theorem

$$y(t) = bk \left\{ \operatorname{Res} f(p)\big|_{p = ik} + \operatorname{Res} f(p)\big|_{p = -ik} + \operatorname{Res} f(p)\big|_{p = \alpha + i\beta} + \operatorname{Res} f(p)\big|_{p = \alpha - i\beta} \right\}.$$

The residue at each point is calculated in accordance with the general formula for simple poles

$$\operatorname{Res} \frac{A(p)}{B(p)}\bigg|_{p = p_0} = \frac{A(p_0)}{B'(p_0)}.$$

Hence we get

$$y(t) = bk \left\{ \frac{e^{(\alpha + i\beta)t}}{[k^2 + (\alpha + i\beta)^2]\, 2i\beta a_0} - \frac{e^{(\alpha - i\beta)t}}{[k^2 + (\alpha - i\beta)^2]\, 2i\beta a_0} + \frac{e^{ikt}}{(a_2 - a_0 k^2 + a_1 ik)\, 2ik} - \frac{e^{-ikt}}{(a_2 - a_0 k^2 - a_1 ik)\, 2ik} \right\}.$$


The resulting process is the superposition of a periodic oscillation with the frequency of the external force and a damped oscillation with the natural frequency of the system; the rate of damping is defined by $\alpha$, i.e. the abscissa of the characteristic roots.
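The residue computation of this example can be sketched in code. The following is a minimal illustration under stated assumptions, not the author's procedure: it sums $e^{p_0 t}/B'(p_0)$ over the four simple poles of $f(p)$, with $B(p) = (a_0 p^2 + a_1 p + a_2)(k^2 + p^2)$, and then checks by central differences that the result satisfies the differential equation. The coefficients $a_0 = 1$, $a_1 = 2$, $a_2 = 5$ (roots $-1 \pm 2i$), $b = 1$, $k = 3$ and all function names are hypothetical choices.

```python
import cmath
import math

def forced_response(a0, a1, a2, b, k, t):
    """y(t) = b*k * (sum over the four simple poles p0 of e^{p0 t} / B'(p0))."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a0 * a2)
    roots = [(-a1 + disc) / (2.0 * a0), (-a1 - disc) / (2.0 * a0)]
    poles = [1j * k, -1j * k] + roots

    def b_prime(p):
        # B(p) = (a0 p^2 + a1 p + a2)(k^2 + p^2), differentiated by the product rule
        return (2.0 * a0 * p + a1) * (k * k + p * p) + (a0 * p * p + a1 * p + a2) * 2.0 * p

    total = sum(cmath.exp(p * t) / b_prime(p) for p in poles)
    return (b * k * total).real

def ode_residual(a0, a1, a2, b, k, t, h=1e-3):
    """a0 y'' + a1 y' + a2 y - b sin kt, with derivatives taken by central differences."""
    y = lambda u: forced_response(a0, a1, a2, b, k, u)
    d1 = (y(t + h) - y(t - h)) / (2.0 * h)
    d2 = (y(t + h) - 2.0 * y(t) + y(t - h)) / (h * h)
    return a0 * d2 + a1 * d1 + a2 * y(t) - b * math.sin(k * t)

# Hypothetical data: alpha = -1, beta = 2, forcing frequency k = 3
params = (1.0, 2.0, 5.0, 1.0, 3.0)
residuals = [abs(ode_residual(*params, t)) for t in (0.5, 1.0, 2.0, 4.0)]
y0 = forced_response(*params, 0.0)

print("y(0) =", y0)
print("max |residual| =", max(residuals))
```

The sketch assumes that the four poles are distinct (in particular $\beta \ne k$ or $\alpha \ne 0$); the resonant case needs the rule for multiple poles.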

For $\alpha = 0$, $\beta = k$ resonance occurs. In this case the original equation is of the form

$$y'' + k^2 y = b \sin kt$$

and the solution becomes

$$y(t) = \frac{bk}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{pt}\, dp}{(p^2 + k^2)^2}.$$

The points $p = \pm ik$ are second-order poles of the integrand. Evaluating the residues in accordance with the general rules (for multiple poles), we find

$$y(t) = bk \left\{ e^{ikt} \left( -\frac{t}{4k^2} + \frac{1}{4ik^3} \right) + e^{-ikt} \left( -\frac{t}{4k^2} - \frac{1}{4ik^3} \right) \right\} = -\frac{bt}{2k} \cos kt + \frac{b}{2k^2} \sin kt.$$

The amplitude of the resulting oscillation increases without bound.
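The resonant solution can be checked directly. The sketch below takes the closed form $y(t) = \frac{b}{2k^2} \sin kt - \frac{bt}{2k} \cos kt$ (the value obtained from the two second-order residues), verifies by a central difference that it satisfies $y'' + k^2 y = b \sin kt$, and samples the growing envelope. The values $b = 1$, $k = 2$ and the function names are illustrative assumptions.

```python
import math

def resonant_solution(b, k, t):
    """y(t) = b/(2k^2) sin kt - b t/(2k) cos kt."""
    return b / (2.0 * k * k) * math.sin(k * t) - b * t / (2.0 * k) * math.cos(k * t)

def residual(b, k, t, h=1e-3):
    """y'' + k^2 y - b sin kt, with y'' from a central difference."""
    y = lambda u: resonant_solution(b, k, u)
    d2 = (y(t + h) - 2.0 * y(t) + y(t - h)) / (h * h)
    return d2 + k * k * y(t) - b * math.sin(k * t)

b, k = 1.0, 2.0
max_residual = max(abs(residual(b, k, t)) for t in (0.3, 1.0, 5.0, 20.0))

# At t = m*pi/k the sine term vanishes and |y| equals the envelope b t / (2k)
peaks = [abs(resonant_solution(b, k, m * math.pi / k)) for m in (10, 20, 40)]

print("max |residual| =", max_residual)
print("sampled envelope:", peaks)
```

The sampled values grow linearly in $t$, which is the unbounded increase of amplitude mentioned above.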




3. The same methods are applicable to partial differential 
equations. 

If the Laplace transformation with respect to $t$ carried an ordinary equation into an algebraic equation in the unknown function, then in the transform of an equation which contains derivatives not only with respect to $t$ but also with respect to variables $x, y, \ldots$, derivatives with respect to $t$ will have vanished while derivatives with respect to $x, y, \ldots$ remain. For a large number of independent variables, the simplification achieved is of course slight, but in the case of two independent variables $t, x$, the Laplace transform method can be applied to great effect.

As an example, let us consider the thermal conductivity equation $\partial u/\partial t = \partial^2 u/\partial x^2$ over a finite interval $0 \le x \le l$ with initial and boundary conditions $u_x(0, t) = 0$, $u(l, t) = u_1$, $u(x, 0) = u_0$. In physical terms, these conditions mean that no heat is lost through the end $x = 0$, while a constant temperature $u_1$ is maintained at $x = l$; at the initial instant the temperature is constant and equal to $u_0$.

For the solution of the problem we apply the Laplace transform with respect to $t$, i.e. we pass from the function $u(x, t)$ to the function

$$v(x, p) = \int_0^{\infty} e^{-pt}\, u(x, t)\, dt.$$

For the function $v(x, p)$ we obtain the equation

$$\frac{d^2 v(x, p)}{dx^2} - p\, v(x, p) = -u_0$$

with the conditions

$$v_x(0, p) = 0, \qquad v(l, p) = \frac{u_1}{p}.$$

This second-order equation has the solution

$$v(x, p) = \frac{u_0}{p} + \frac{u_1 - u_0}{p}\, \frac{\cosh x\sqrt{p}}{\cosh l\sqrt{p}},$$

and hence

$$u(x, t) = u_0 + \frac{u_1 - u_0}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{pt}\, \frac{\cosh x\sqrt{p}}{\cosh l\sqrt{p}}\, \frac{dp}{p}.$$




The integrand is a well-defined function of $p$ with poles at $p_0 = 0$ and $p_n = -(\pi^2/l^2)(n - 1/2)^2$ $(n = 1, 2, \ldots)$. We shall show that its integral is equal to the sum of its residues at all these poles. For this we consider the semicircle $\Gamma_n$ in the left half-plane with centre the origin and radius $n^2 \pi^2/l^2$; it passes between two adjacent poles and we shall show that the ratio $\dfrac{\cosh x\sqrt{p}}{\cosh l\sqrt{p}}$ is bounded over its whole length; then by Jordan's lemma, as $n \to \infty$ the integral along $\Gamma_n$ tends to zero and the whole integral reduces, as usual, to the sum of the residues.

Instead of considering the ratio $\dfrac{\cosh x\sqrt{p}}{\cosh l\sqrt{p}}$ on the semicircle $\Gamma_n$, where $|p| = n^2 (\pi^2/l^2)$, we can replace $\sqrt{p}$ by $\zeta$, and consider the ratio $\dfrac{\cosh x\zeta}{\cosh l\zeta}$ on the quarter-circumference of the circle $L_n$ of radius $n(\pi/l)$ where the argument varies from $\pi/4$ to $3\pi/4$ (Fig. 18). Putting $\zeta = \xi + i\tau$, we have $\tau > 0$, $|\xi| \le \tau$, and

$$\begin{aligned}
\left| \frac{\cosh x\zeta}{\cosh l\zeta} \right|^2 &= \left| \frac{\cosh x(\xi + i\tau)}{\cosh l(\xi + i\tau)} \right|^2 = \left| \frac{\cosh x\xi \cos x\tau + i \sinh x\xi \sin x\tau}{\cosh l\xi \cos l\tau + i \sinh l\xi \sin l\tau} \right|^2 \\
&= \frac{\cosh^2 x\xi \cos^2 x\tau + \sinh^2 x\xi \sin^2 x\tau}{\cosh^2 l\xi \cos^2 l\tau + \sinh^2 l\xi \sin^2 l\tau} \le \frac{\cosh^2 l\xi}{\cosh^2 l\xi \cos^2 l\tau + \sinh^2 l\xi \sin^2 l\tau}.
\end{aligned} \tag{1}$$




If $|\xi| \le \delta$, then on the circle $L_n$ we have $|\tau - n(\pi/l)| < \varepsilon$ for sufficiently large $n$, and consequently $\cos^2 l\tau > 1 - \eta$, where $\varepsilon$, $\eta$ are arbitrarily small; hence

$$\left| \frac{\cosh x\zeta}{\cosh l\zeta} \right|^2 \le \frac{\cosh^2 l\xi}{(1 - \eta) \cosh^2 l\xi} = \frac{1}{1 - \eta}. \tag{2}$$

If $|\xi| \ge \delta$, we substitute $\sinh^2 l\xi$ for $\cosh^2 l\xi$ in the denominator of (1); then we get

$$\left| \frac{\cosh x\zeta}{\cosh l\zeta} \right|^2 \le \frac{\cosh^2 l\xi}{\sinh^2 l\xi} = \coth^2 l\xi \le \coth^2 l\delta. \tag{3}$$

Inequalities (2) and (3) show that the ratio $\dfrac{\cosh x\sqrt{p}}{\cosh l\sqrt{p}}$ is bounded by a fixed constant on the given circles. Hence the integral reduces, as we said, to the sum of the residues. For the pole $p = 0$ the residue is 1, and at the pole $p_n = -(\pi^2/l^2)(n - 1/2)^2$ the residue is easily computed to be

$$\frac{4}{\pi}\, \frac{(-1)^n}{2n - 1}\, e^{-\frac{\pi^2}{l^2}\left(n - \frac{1}{2}\right)^2 t} \cos\left(n - \frac{1}{2}\right) \frac{\pi x}{l}.$$

Finally we obtain the solution in the form of a series

$$u(x, t) = u_1 + \frac{4}{\pi}\, (u_1 - u_0) \sum_{n=1}^{\infty} \frac{(-1)^n}{2n - 1}\, e^{-\frac{\pi^2}{l^2}\left(n - \frac{1}{2}\right)^2 t} \cos\left(n - \frac{1}{2}\right) \frac{\pi x}{l}.$$
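The series solution lends itself to a quick numerical check. The sketch below assumes the form $u(x, t) = u_1 + \frac{4}{\pi}(u_1 - u_0) \sum_{n \ge 1} \frac{(-1)^n}{2n - 1} e^{-(\pi^2/l^2)(n - 1/2)^2 t} \cos\left(n - \frac12\right)\frac{\pi x}{l}$ obtained from the residues, truncates it after a fixed number of terms, and tests the boundary condition at $x = l$, the initial temperature and the limit for large $t$. The values $l = 1$, $u_0 = 0$, $u_1 = 1$ and the name `heat_series` are illustrative assumptions.

```python
import math

def heat_series(x, t, l=1.0, u0=0.0, u1=1.0, terms=200):
    """Partial sum of the residue series for u(x, t) with u_x(0,t)=0, u(l,t)=u1, u(x,0)=u0."""
    total = 0.0
    for n in range(1, terms + 1):
        c = (n - 0.5) * math.pi / l               # square root of -p_n
        total += (-1) ** n / (2 * n - 1) * math.exp(-c * c * t) * math.cos(c * x)
    return u1 + 4.0 / math.pi * (u1 - u0) * total

at_boundary = heat_series(1.0, 0.05)    # should equal u1 at x = l for every t
early_inside = heat_series(0.3, 0.001)  # still close to u0 shortly after t = 0
late_inside = heat_series(0.5, 5.0)     # tends to u1 as t grows

print(at_boundary, early_inside, late_inside)
```

For $t > 0$ the exponential factors make the truncated sum converge very quickly; at $t = 0$ exactly the series converges only conditionally, so the sketch samples a small positive time instead.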


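Problems. 1. Complete the empty places in the table

No.    y(t)           Y(p)
1      $t^n$
2      $\cos at$
3      $\sinh at$
4                     $\dfrac{p}{p^2 - a^2}$
5                     $\dfrac{a}{p^2 + a^2}$

Answer. $Y_1(p) = \dfrac{n!}{p^{n+1}}$, $Y_2(p) = \dfrac{p}{p^2 + a^2}$, $Y_3(p) = \dfrac{a}{p^2 - a^2}$; $y_4(t) = \cosh at$, $y_5(t) = \sin at$.

2. Solve the equation $y^{(IV)} + 4y''' + 4y'' = 0$ under the conditions $y_0 = 0$, $y_1 = 1$, $y_2 = 2$, $y_3 = 3$.

Answer. $4y(t) = -9 + 15t + 9e^{-2t} + 7te^{-2t}$.

3. Solve the system of equations

$$\begin{aligned}
y'' - 3y' + 2y + z' - z &= 0, \\
-y' + y + z'' - 5z' + 4z &= 0
\end{aligned}$$

subject to the conditions $y_0 = y_1 = z_1 = 0$, $z_0 = 1$.

Answer. $4y = e^t - e^{3t} + 2te^{3t}$, $4z = 5e^t - e^{3t} - 2te^{3t}$.

4. Solve the thermal conductivity equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \qquad (x \ge 0,\ t \ge 0)$$

subject to the conditions $u(x, 0) = 0$, $u(0, t) = a \cos \omega t$.

Answer:

$$u(x, t) = a\, e^{-x\sqrt{\omega/2}} \cos\left(\omega t - x\sqrt{\omega/2}\right) - \frac{a}{\pi} \int_0^{\infty} e^{-\xi t} \sin\left(x\sqrt{\xi}\right) \frac{\xi\, d\xi}{\xi^2 + \omega^2}.$$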


Note. A set of examples on the Laplace transform and its application in 
problems of mathematical physics can be found in: H.S. Carslaw and 
J.C. Jaeger: Operational Methods in Applied Mathematics, Oxford (1941); 
I. N. Sneddon, Fourier Transforms, McGraw-Hill (1951). 


5. QUASI-ANALYTIC CLASSES OF FUNCTIONS 


1. The Laplace transform method can also be successfully applied to the solution of theoretical problems. As one such application we give an account here of the fundamental theorem of the theory of quasi-analytic classes†.

† From S. Mandelbrojt: Séries de Fourier et classes quasi-analytiques de fonctions, Gauthier-Villars (1935).

It is known that a function $f(x)$ of a real variable $x$, though infinitely differentiable in a neighbourhood of a point $x_0$, need nevertheless not be analytic, i.e. capable of development in a Taylor series in a neighbourhood of the point. But if the successive derived functions of $f(x)$ do not increase too rapidly, viz. if

$$\max_{|x - x_0| \le \delta} |f^{(n)}(x)| \le C M^n n!, \tag{1}$$

then the analyticity of the function $f(x)$ in a neighbourhood of the point $x_0$ is guaranteed. For the remainder in Taylor's formula

$$R_n(x) = f(x) - f(x_0) - (x - x_0) f'(x_0) - \cdots - \frac{(x - x_0)^{n-1}}{(n-1)!} f^{(n-1)}(x_0) = \frac{(x - x_0)^n}{n!} f^{(n)}(x_1) \qquad (x_0 - \delta < x_1 < x_0 + \delta)$$

in this case admits the bound $|R_n(x)| \le C M^n |x - x_0|^n$ and tends to zero for $|x - x_0| < 1/M$; hence in the interval $|x - x_0| < 1/M$ the function $f(x)$ is the sum of its Taylor series. Applying Cauchy's formula for the derivatives of an analytic function, it is easily verified that, conversely, the analyticity of $f(x)$ in a neighbourhood of the point $x_0$ entails the fulfilment of condition (1).

Let $m_0, m_1, \ldots, m_n, \ldots$ be an arbitrary sequence of positive numbers. We form the class $C_{\langle m_n \rangle}$ of functions $f(x)$ which are defined on the line $-\infty < x < \infty$ and satisfy inequalities

$$|f^{(n)}(x)| \le C M^n m_n \qquad (n = 0, 1, 2, \ldots),$$

where $C$, $M$ are constants which may depend on the choice of the function $f$. If the numbers $m_n$ increase more rapidly than $n!$, the class $C_{\langle m_n \rangle}$ can include non-analytic functions. But, as was shown by A. Denjoy in 1921, if $\sum_{n=1}^{\infty} \left( \dfrac{1}{m_n} \right)^{1/n} = \infty$, the class $C_{\langle m_n \rangle}$ possesses the following remarkable property: if two functions $f(x)$, $g(x)$ belonging to the class $C_{\langle m_n \rangle}$ coincide at some point $x_0$ together with all their derivatives, they coincide identically for all values of $x$. For analytic functions this property has been well known since Cauchy's time.

The classes $C_{\langle m_n \rangle}$ in which the coincidence of two functions together with all their derivatives implies their coincidence everywhere have been termed quasi-analytic classes. In 1926 T. Carleman gave a complete description of quasi-analytic classes; a somewhat simpler formulation was proposed by A. Ostrovsky in 1930. In Ostrovsky's formulation Carleman's theorem reads as follows:

THEOREM 1. Let us put

$$T(r) = \sup_{n \ge 0} \frac{r^n}{m_n}. \tag{2}$$

Then a necessary and sufficient condition for the class $C_{\langle m_n \rangle}$ to be quasi-analytic is that

$$\int_1^{\infty} \frac{\log T(r)}{r^2}\, dr = \infty. \tag{3}$$

For example, let $m_n = n^{n\alpha}$, where $\alpha$ is fixed. Then it is easily calculated that

$$\log T(r) \sim \frac{\alpha}{e}\, r^{1/\alpha};$$

the integral (3) converges for $\alpha > 1$ and diverges for $\alpha \le 1$. In accordance with Carleman's theorem the class $C_{\langle m_n \rangle}$ is quasi-analytic for $\alpha \le 1$ (as we saw above, it may even be composed of analytic functions) and non-quasi-analytic for $\alpha > 1$.
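The behaviour of $T(r)$ in this example can be explored numerically. The sketch below is an illustration under assumptions, not part of the original argument: it takes $m_n = n^{n\alpha}$ with $m_0 = 1$, uses the concavity of $n \log r - \alpha n \log n$ in $n$ to locate the supremum at one of the two integers next to $r^{1/\alpha}/e$, and then compares partial integrals of $\log T(r)/r^2$ for $\alpha = 1$ and $\alpha = 2$. The function names and the cut-off values are hypothetical.

```python
import math

def log_T(r, alpha):
    """log T(r) = sup over n >= 0 of (n log r - alpha n log n), for m_n = n^(n alpha), m_0 = 1."""
    def value(n):
        return 0.0 if n == 0 else n * math.log(r) - alpha * n * math.log(n)
    n_star = r ** (1.0 / alpha) / math.e      # maximiser when n is treated as a real variable
    lo = int(n_star)
    return max(value(lo), value(lo + 1))      # concavity: the integer maximum is next to n_star

def partial_integral(alpha, R, steps=20000):
    """Trapezoidal value of integral_1^R log T(r) / r^2 dr after the substitution r = e^u."""
    U = math.log(R)
    h = U / steps
    total = 0.0
    for j in range(steps + 1):
        u = j * h
        w = 0.5 if j in (0, steps) else 1.0
        total += w * log_T(math.exp(u), alpha) * math.exp(-u)
    return total * h

ratio = log_T(1.0e4, 2.0) / ((2.0 / math.e) * 1.0e4 ** 0.5)
growth_alpha_1 = partial_integral(1.0, 1.0e9) - partial_integral(1.0, 1.0e6)
growth_alpha_2 = partial_integral(2.0, 1.0e9) - partial_integral(2.0, 1.0e6)

print("log T(r) / ((alpha/e) r^(1/alpha)) at r = 1e4, alpha = 2:", ratio)
print("increase of the integral from R = 1e6 to 1e9, alpha = 1:", growth_alpha_1)
print("increase of the integral from R = 1e6 to 1e9, alpha = 2:", growth_alpha_2)
```

For $\alpha = 1$ the partial integrals keep growing roughly like $(\log R)/e$, while for $\alpha = 2$ they have essentially stopped changing, in line with the divergence and convergence of (3) stated above.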

There exist quasi-analytic classes which are not composed solely of analytic functions. It can be shown that the function $f(x) = \sum_{n=1}^{\infty} \dfrac{1}{T(n)} \cos nx$ is contained in the class $C_{\langle m_n \rangle}$ and is not analytic if $\dfrac{\sqrt[n]{m_n}}{n} \to \infty$; hence for $m_n = n!\, (\log n)^n$, say, there exist non-analytic functions in the quasi-analytic class $C_{\langle m_n \rangle}$.

2. In this article we shall use the Laplace transform to reduce the problem of quasi-analytic classes to another problem, which relates to analytic functions in a half-plane.

Let us suppose that the class $C_{\langle m_n \rangle}$ is non-quasi-analytic. This means that it contains functions $f(x)$, $g(x)$ which coincide at $x = x_0$ together with all their derivatives but do not coincide everywhere. Without loss of generality we can suppose that $x_0 = 0$ and $f(x) \not\equiv g(x)$ for $x > 0$; we can always comply with these conditions by means of translation and the substitution of $-x$ for $x$, i.e. through operations which can be carried out within the class $C_{\langle m_n \rangle}$. We consider further the function $\varphi(x)$ equal to $f(x) - g(x)$ for $x \ge 0$ and equal to 0 for $x < 0$; evidently it also belongs to the class $C_{\langle m_n \rangle}$. Since it vanishes for $x < 0$ and is bounded for $x > 0$, it possesses a Laplace transform

$$\Phi(p) = \int_0^{\infty} \varphi(x)\, e^{-px}\, dx, \tag{1}$$

which is analytic in the half-plane $\operatorname{Re} p > 0$.

Let us see what properties the function $\Phi(p)$ possesses. Integrating (1) $n$ times by parts, we get

$$p^n \Phi(p) = \int_0^{\infty} \varphi^{(n)}(x)\, e^{-px}\, dx,$$

which yields the bound

$$|p^n \Phi(p)| \le C M^n m_n \int_0^{\infty} |e^{-px}|\, dx = \frac{C M^n m_n}{\operatorname{Re} p} \le C_1 M^n m_n$$

for $\operatorname{Re} p \ge \gamma > 0$. Conversely, let there exist in the half-plane $\operatorname{Re} p \ge \gamma$ an analytic function $\Phi(p) \not\equiv 0$ which satisfies inequalities of the form

$$|p^n \Phi(p)| \le C M^n m_n \qquad (n = 0, 1, 2, \ldots).$$

It is clear that $\dfrac{\Phi(p)}{p^2}$ satisfies the conditions of theorem 1 of Section 4; as the integrable majorant required by condition (b) we can take, say, $\dfrac{C m_0}{\gamma^2 + p_2^2}$. In virtue of this theorem the function $\varphi(x)$ defined by the equation

$$\varphi(x) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\Phi(p)}{p^2}\, e^{(p - \gamma)x}\, dp \tag{2}$$

vanishes for $x < 0$. Since $\Phi(p) \not\equiv 0$, we have $\varphi(x) \not\equiv 0$ for $x > 0$. Moreover $\varphi(x)$ has derivatives of all orders and

$$\begin{aligned}
|\varphi^{(n)}(x)| &= \left| \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{\Phi(p)}{p^2}\, (p - \gamma)^n\, e^{(p - \gamma)x}\, dp \right| \\
&\le \frac{C M^n m_n}{2\pi} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{|p - \gamma|^n}{|p|^n}\, \frac{|dp|}{|p|^2} \\
&\le M^n m_n\, \frac{C}{2\pi} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{|dp|}{|p|^2} = C_1 M^n m_n.
\end{aligned}$$


We see that $\varphi(x)$ belongs to the class $C_{\langle m_n \rangle}$. Since $\varphi(x) = 0$ for $x < 0$ and $\varphi(x) \not\equiv 0$, the class $C_{\langle m_n \rangle}$ is not quasi-analytic. Thus the problem of quasi-analytic classes is equivalent to the problem ("Watson's problem") of the existence of a function $\Phi(p) \not\equiv 0$ which is analytic in the right half-plane and satisfies inequalities of the form

$$|p^n \Phi(p)| \le C M^n m_n \qquad (n = 0, 1, 2, \ldots).$$

3. The inverse transformation $p = 2\gamma/s$ takes the half-plane $\operatorname{Re} p > \gamma$ into the disc $|s - 1| < 1$, and Watson's problem reduces to the following: under what conditions, imposed on the sequence $m_n$, does there exist in the disc $|s - 1| < 1$ an analytic function $F(s) \not\equiv 0$ which satisfies inequalities of the form

$$|F(s)| \le C M^n m_n |s|^n \qquad (n = 0, 1, 2, \ldots)? \tag{3}$$




Let us suppose that such a function $F(s)$ exists. We can find $\rho$ such that $F(\rho) \ne 0$, $|F(\rho + \rho e^{i\theta})| \le 1$ for all real $\theta$, and $s = 0$ is the only zero of $F(s)$ on the circle $s = \rho + \rho e^{i\theta}$. All the subsequent constructions will extend over the disc $|s - \rho| \le \rho$. In virtue of the inequalities (3)

$$|F(\rho + \rho e^{i\theta})| \le C M^n m_n\, \rho^n\, |1 + e^{i\theta}|^n = C M^n m_n \left( 2\rho \left| \cos \frac{\theta}{2} \right| \right)^n.$$

Taking the minimum over $n$ on the right-hand side, we get

$$|F(\rho + \rho e^{i\theta})| \le \frac{C}{\max\limits_n \dfrac{1}{m_n} \left( \dfrac{1}{2M\rho \left| \cos \frac{\theta}{2} \right|} \right)^n}$$

and by the definition of the function $T(r)$

$$|F(\rho + \rho e^{i\theta})| \le \frac{C}{T\left( \dfrac{1}{2M\rho \left| \cos \frac{\theta}{2} \right|} \right)},$$

so that

$$\log |F(\rho + \rho e^{i\theta})| \le \log C - \log T\left( \frac{1}{2M\rho \left| \cos \frac{\theta}{2} \right|} \right).$$

The following theorem is well known in the theory of analytic functions (we shall give a proof of it in art. 5): if the function $f(z)$ is analytic in the disc $|z - z_0| < h$, non-zero at $z = z_0$, at most 1 in absolute value, continuous over the closed disc $|z - z_0| \le h$ and has a single zero on the circle $|z - z_0| = h$, then the integral

$$-\int_0^{2\pi} \log |f(z_0 + h e^{i\theta})|\, d\theta$$

has a finite value.

Applying this theorem to the case we are considering, we find that the function

$$\log T\left( \frac{1}{2M\rho \left| \cos \frac{\theta}{2} \right|} \right) \le \log C - \log |F(\rho + \rho e^{i\theta})|$$

has a finite integral with respect to $\theta$ between the limits 0 and $2\pi$. If we make the substitution

$$2M\rho \cos \frac{\theta}{2} = \frac{1}{r},$$

we see that the integral

$$\int_{1/(2M\rho)}^{\infty} \frac{\log T(r)}{r^2}\, \frac{dr}{M\rho \sqrt{1 - \dfrac{1}{4M^2\rho^2 r^2}}}$$

converges and with it the integral

$$\int_1^{\infty} \frac{\log T(r)}{r^2}\, dr. \tag{4}$$

Thus if the class $C_{\langle m_n \rangle}$ is not quasi-analytic, the integral (4) converges. We have established the sufficiency of Carleman's condition in theorem 1.

4. Proceeding to the proof of the necessity of Carleman's condition, let us assume that the integral (4) converges. Then the integral

$$\int_0^{2\pi} \log T\left( \frac{1}{2 \left| \cos \frac{\theta}{2} \right|} \right) d\theta$$

also converges and we can therefore construct Poisson's integral

$$G(r e^{i\varphi}) = \frac{1}{2\pi} \int_0^{2\pi} \log T\left( \frac{1}{2 \left| \cos \frac{\theta}{2} \right|} \right) \frac{1 - r^2}{1 - 2r \cos(\theta - \varphi) + r^2}\, d\theta,$$

which represents a function harmonic on the disc $r < 1$. We put $G(s - 1) = P(s)$ and denote by $Q(s)$ the conjugate harmonic function on the disc $|s - 1| < 1$. Further, let

$$F(s) = e^{-\{P(s) + i Q(s)\}}.$$

We claim that the function $F(s)$ satisfies the inequalities

$$|F(s)| \le m_n |s|^n \qquad (n = 0, 1, 2, \ldots). \tag{5}$$




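For the inequalities (5) are equivalent to the inequalities

$$e^{-P(s)} \le m_n |s|^n \qquad (n = 0, 1, 2, \ldots)$$

or

$$-G(s) = -P(s + 1) \le \log m_n + n \log |s + 1|. \tag{6}$$

Both terms on the right can be expressed in the form of Poisson integrals:

$$\log m_n = \frac{1}{2\pi} \int_0^{2\pi} \frac{\log m_n\, (1 - r^2)}{1 - 2r \cos(\theta - \varphi) + r^2}\, d\theta,$$

$$n \log |s + 1| = \frac{1}{2\pi} \int_0^{2\pi} \frac{n \log |e^{i\theta} + 1|\, (1 - r^2)}{1 - 2r \cos(\theta - \varphi) + r^2}\, d\theta,$$

and hence inequality (6), which is subject to proof, can be expressed in the form

$$\frac{1}{2\pi} \int_0^{2\pi} \log \left[ T\left( \frac{1}{2 \left| \cos \frac{\theta}{2} \right|} \right) m_n\, |1 + e^{i\theta}|^n \right] \frac{1 - r^2}{1 - 2r \cos(\theta - \varphi) + r^2}\, d\theta \ge 0. \tag{7}$$

But $|1 + e^{i\theta}| = 2 |\cos \theta/2|$; since

$$T(r) = \sup_n \frac{r^n}{m_n},$$

we have for every $n$

$$T(r) \ge \frac{r^n}{m_n}, \qquad T(r)\, m_n\, r^{-n} \ge 1,$$

and hence the integrand in (7) is non-negative. It follows that (7) is true, and therefore inequality (5) is satisfied; in accordance with art. 2, the class $C_{\langle m_n \rangle}$ is not quasi-analytic. This completes the proof of Carleman's theorem.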

5. In this article we shall prove the theorem from the theory of functions of a complex variable that we used in art. 3.

THEOREM. If the function $f(z)$ is analytic in the disc $|z - z_0| < h$, non-zero at $z = z_0$, at most 1 in absolute value, continuous over the closed disc $|z - z_0| \le h$ and has a single zero on the circle $|z - z_0| = h$ at the point $z^*$, then the integral

$$-\int_0^{2\pi} \log |f(z_0 + h e^{i\theta})|\, d\theta$$

is finite.

Proof. Without loss of generality we can put $z_0 = 0$, $h = 1$, $z^* = 1$. In the disc $|z| \le r < 1$ the function $f(z)$ is analytic and can have only a finite number of zeros $z_1, \ldots, z_m$; we assume that the boundary $|z| = r$ is free from zeros.

Consider the closed contour $C$ shown in Fig. 19, composed of arcs of the circle $|z| = r$ taken in the positive direction, circles $C_k$ $(k = 1, 2, \ldots, m)$ of very small radius $\varepsilon$ taken in the negative direction, and lines $L_k = [z_k', z_k'']$ joining the curves specified ($z_k'$ on $C_k$, $z_k''$ on $|z| = r$) and taken twice in opposite directions. The function $\log f(z)$ is analytic inside the contour $C$ and the value of $\log f(0)$ can be expressed by Cauchy's formula

$$\log f(0) = \frac{1}{2\pi i} \oint_C \frac{\log f(z)}{z}\, dz. \tag{1}$$

Let us consider the part of the contour formed by the circle $C_j$ of radius $\varepsilon$ with centre $z_j$ taken in the negative direction. The part of the integral (1) that is taken round $C_j$ has the form

$$\frac{1}{2\pi i} \int \frac{\log f(z_j + \varepsilon e^{i\theta})}{z_j + \varepsilon e^{i\theta}}\, i \varepsilon e^{i\theta}\, d\theta. \tag{2}$$

If $k_j$ denotes the multiplicity of the zero $z_j$, then $f(z) = (z - z_j)^{k_j} f_j(z)$ where $f_j(z_j) \ne 0$, and

$$|\log f(z)| = |\log\, (z - z_j)^{k_j} f_j(z)| = |k_j \log (z - z_j) + \log f_j(z)| \le k_j \big( |\log |z - z_j|| + 2\pi \big) + |\log f_j(z)| \le k_j |\log \varepsilon| + C.$$

This bound shows that the integrand in (2) becomes arbitrarily small as $\varepsilon \to 0$ and therefore all the integrals round the circles $C_j$ tend to zero as $\varepsilon \to 0$.

In a negative circuit of the point $z_j$ the function $\log f(z) = \log |f(z)| + i \arg f(z)$ acquires the increment $-2\pi k_j i$, hence the integral over the part of the contour formed by the segment $L_j$ taken twice in opposite directions is equal to

$$-k_j \int_{z_j'}^{z_j''} \frac{dz}{z} = k_j [\log z_j' - \log z_j''].$$

On each successive arc of the circle $|z| = r$ the function $\log f(z)$ acquires an increment $-2\pi k_j i$ relative to the preceding arc, which gives a corresponding increment in the whole integral (1) of the form

$$\frac{1}{2\pi i} \int (-2\pi k_j i)\, \frac{dz}{z} = -i k_j \int d\theta,$$

which is evidently purely imaginary. With all this in view we can detach the real part in equation (1) as $\varepsilon \to 0$; we get

$$\log |f(0)| = \sum_j k_j \log \left| \frac{z_j}{r} \right| + \frac{1}{2\pi} \int_0^{2\pi} \log |f(r e^{i\theta})|\, d\theta,$$

and since $|z_j| < r$, $\log \left| \dfrac{z_j}{r} \right| < 0$,

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f(r e^{i\theta})|\, d\theta \ge \log |f(0)|,$$

or, what is the same thing,

$$-\frac{1}{2\pi} \int_0^{2\pi} \log |f(r e^{i\theta})|\, d\theta \le -\log |f(0)|.$$

By hypothesis there is a single zero on the circle $|z| = 1$ at the point $z^* = 1$. Choosing an arbitrary $\delta > 0$, we evidently have

$$-\frac{1}{2\pi} \int_{\delta}^{2\pi - \delta} \log |f(r e^{i\theta})|\, d\theta \le -\frac{1}{2\pi} \int_0^{2\pi} \log |f(r e^{i\theta})|\, d\theta \le -\log |f(0)|.$$

Keeping $\delta$ fixed, we allow $r$ to approach 1; we get

$$-\frac{1}{2\pi} \int_{\delta}^{2\pi - \delta} \log |f(e^{i\theta})|\, d\theta \le -\log |f(0)|.$$

This inequality holds for any $\delta > 0$. In the limit as $\delta \to 0$ we get that the integral

$$-\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{i\theta})|\, d\theta$$

exists. The proof is complete.


6. THE FOURIER TRANSFORM IN THE CLASS $L_2(-\infty, \infty)$


1. A function $\varphi(x)$ whose square is integrable over the whole $x$-axis is not, in general, so integrable itself (e.g. $1/\sqrt{x^2 + 1}$) and so, in the ordinary sense, it has no Fourier transform. We shall show, nevertheless, that the following proposition holds (replacing the theorem on the Fourier transform in the class $L_1$):

THEOREM (Plancherel, 1910). For any function $\varphi(x) \in L_2(-\infty, \infty)$ the integral

$$\psi_N(\sigma) = \int_{-N}^{N} \varphi(x)\, e^{-i\sigma x}\, dx \tag{1}$$

represents a function belonging to the space $L_2(-\infty, \infty)$ (over $\sigma$). As $N \to \infty$ the function $\psi_N(\sigma)$ has a certain limit $\psi(\sigma)$ in the metric of $L_2(-\infty, \infty)$, with

$$\int_{-\infty}^{\infty} |\psi(\sigma)|^2\, d\sigma = 2\pi \int_{-\infty}^{\infty} |\varphi(x)|^2\, dx. \tag{2}$$

If in addition $\varphi(x)$ belongs to $L_1(-\infty, \infty)$ then $\psi(\sigma)$ is its ordinary Fourier transform. Hence, even in the general case (when $\varphi(x) \notin L_1(-\infty, \infty)$) $\psi(\sigma)$ is said to be the Fourier transform of $\varphi(x)$.

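Proof. Consider functions $\varphi_1(x)$, $\varphi_2(x)$ in the class $S$ (Section 3, art. 4) and let $\psi_1(\sigma)$, $\psi_2(\sigma)$ be their Fourier transforms, which belong to the class $S$. Then

$$\begin{aligned}
\int_{-\infty}^{\infty} \psi_1(\sigma)\, \overline{\psi_2(\sigma)}\, d\sigma &= \int_{-\infty}^{\infty} \psi_1(\sigma) \left\{ \int_{-\infty}^{\infty} \overline{\varphi_2(x)}\, e^{i\sigma x}\, dx \right\} d\sigma \\
&= \int_{-\infty}^{\infty} \overline{\varphi_2(x)} \left\{ \int_{-\infty}^{\infty} \psi_1(\sigma)\, e^{i\sigma x}\, d\sigma \right\} dx = 2\pi \int_{-\infty}^{\infty} \varphi_1(x)\, \overline{\varphi_2(x)}\, dx.
\end{aligned}$$

The inversion of the order of integration at the third stage is valid in virtue of the absolute convergence of the double integral

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |\psi_1(\sigma)|\, |\varphi_2(x)|\, dx\, d\sigma.$$

In particular, for $\varphi_1(x) = \varphi_2(x) = \varphi(x)$, $\psi_1(\sigma) = \psi_2(\sigma) = \psi(\sigma)$, we get:

$$\int_{-\infty}^{\infty} |\psi(\sigma)|^2\, d\sigma = 2\pi \int_{-\infty}^{\infty} |\varphi(x)|^2\, dx. \tag{3}$$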


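Further, let $\varphi(x)$ be a square-summable function which vanishes for $|x| \ge a$. We can form a sequence of functions $\varphi_n(x) \in S$ which vanish for $|x| \ge a$ and converge to $\varphi(x)$ in the metric of $L_2(-a, a)$. Since by the Cauchy-Bunyakovsky inequality

$$\int_{-a}^{a} |f(x)|\, dx \le \sqrt{\int_{-a}^{a} dx}\, \sqrt{\int_{-a}^{a} |f(x)|^2\, dx} = \sqrt{2a}\, \sqrt{\int_{-a}^{a} |f(x)|^2\, dx}$$

for any function $f \in L_2(-a, a)$, we have

$$\int_{-a}^{a} |\varphi(x) - \varphi_n(x)|\, dx \le \sqrt{2a}\, \sqrt{\int_{-a}^{a} |\varphi(x) - \varphi_n(x)|^2\, dx} \to 0,$$

so that $\varphi_n(x)$ also converges to $\varphi(x)$ in the metric of $L_1(-a, a)$; and since both $\varphi_n(x)$ and $\varphi(x)$ vanish for $|x| \ge a$, this convergence holds in the metric of $L_1(-\infty, \infty)$. But then the Fourier transforms $\psi_n(\sigma)$ of the functions $\varphi_n(x)$ converge uniformly for $-\infty < \sigma < \infty$ to the Fourier transform $\psi(\sigma)$ of the function $\varphi(x)$. In addition the functions $\psi_n(\sigma)$ form a sequence which is fundamental in the metric of $L_2(-\infty, \infty)$, since by what we have proved

$$\int_{-\infty}^{\infty} |\psi_n(\sigma) - \psi_m(\sigma)|^2\, d\sigma = 2\pi \int_{-\infty}^{\infty} |\varphi_n(x) - \varphi_m(x)|^2\, dx.$$

It follows that $\psi(\sigma) = \lim \psi_n(\sigma)$ belongs to $L_2(-\infty, \infty)$ and

$$\int_{-\infty}^{\infty} |\psi(\sigma)|^2\, d\sigma = \lim_{n \to \infty} \int_{-\infty}^{\infty} |\psi_n(\sigma)|^2\, d\sigma = 2\pi \lim_{n \to \infty} \int_{-\infty}^{\infty} |\varphi_n(x)|^2\, dx = 2\pi \int_{-\infty}^{\infty} |\varphi(x)|^2\, dx.$$

In effect the limits of the last two integrals are $-a$, $a$.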

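Finally let $\varphi(x)$ be any function in $L_2(-\infty, \infty)$ and let $\varphi_N(x)$ be equal to $\varphi(x)$ for $|x| < N$ and 0 for $|x| \ge N$. By what we have proved the Fourier transform $\psi_N(\sigma)$ of the function $\varphi_N(x)$ belongs to $L_2(-\infty, \infty)$, and

$$\int_{-\infty}^{\infty} |\psi_N(\sigma)|^2\, d\sigma = 2\pi \int_{-\infty}^{\infty} |\varphi_N(x)|^2\, dx, \tag{4}$$

while also

$$\int_{-\infty}^{\infty} |\psi_N(\sigma) - \psi_M(\sigma)|^2\, d\sigma = 2\pi \int_{-\infty}^{\infty} |\varphi_N(x) - \varphi_M(x)|^2\, dx. \tag{5}$$

But the sequence $\varphi_N(x)$ converges in the metric of $L_2(-\infty, \infty)$ to $\varphi(x)$ and is therefore fundamental; it follows from equation (5) that the sequence $\psi_N(\sigma)$ is also fundamental. Putting $\psi(\sigma) = \lim \psi_N(\sigma)$, it follows from (4) that

$$\int_{-\infty}^{\infty} |\psi(\sigma)|^2\, d\sigma = \lim_{N \to \infty} \int_{-\infty}^{\infty} |\psi_N(\sigma)|^2\, d\sigma = 2\pi \lim_{N \to \infty} \int_{-\infty}^{\infty} |\varphi_N(x)|^2\, dx = 2\pi \int_{-\infty}^{\infty} |\varphi(x)|^2\, dx. \tag{6}$$

Finally if $\varphi(x)$ belongs also to $L_1(-\infty, \infty)$, we shall have for the function $\varphi_N(x)$:

$$\int_{-\infty}^{\infty} |\varphi_N(x) - \varphi(x)|\, dx \to 0,$$

whence it follows that $\psi_N(\sigma)$ converges uniformly to the (ordinary) Fourier transform of $\varphi(x)$. But since the $\psi_N(\sigma)$ converge in the mean to $\psi(\sigma)$, this transform is none other than $\psi(\sigma)$ itself.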

This completes the proof of Plancherel’s theorem. 
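Relation (2) can be verified numerically for a concrete function. The sketch below is only an illustration under assumptions: it takes the Gaussian $\varphi(x) = e^{-x^2/2}$ (a choice made here, for which $\int |\varphi|^2\, dx = \sqrt{\pi}$ exactly), computes the truncated transforms $\psi_N(\sigma)$ of formula (1) by the trapezoidal rule, and integrates $|\psi_N(\sigma)|^2$ over a finite range of $\sigma$. The cut-offs `N`, `S` and the grid sizes are hypothetical parameters.

```python
import cmath
import math

def truncated_transform(phi, sigma, N, steps=2000):
    """psi_N(sigma) = integral_{-N}^{N} phi(x) e^{-i sigma x} dx, by the trapezoidal rule."""
    h = 2.0 * N / steps
    total = 0.5 * (phi(-N) * cmath.exp(1j * sigma * N) + phi(N) * cmath.exp(-1j * sigma * N))
    for j in range(1, steps):
        x = -N + j * h
        total += phi(x) * cmath.exp(-1j * sigma * x)
    return total * h

phi = lambda x: math.exp(-0.5 * x * x)   # assumed test function
N = 8.0                                  # truncation in x
S, m = 8.0, 400                          # truncation and number of steps in sigma
h_sigma = 2.0 * S / m

energy_psi = 0.0
for j in range(m + 1):
    sigma = -S + j * h_sigma
    w = 0.5 if j in (0, m) else 1.0      # trapezoidal end-point weights
    energy_psi += w * abs(truncated_transform(phi, sigma, N)) ** 2
energy_psi *= h_sigma

energy_phi = math.sqrt(math.pi)          # exact value of the integral of |phi|^2

print("integral of |psi|^2      :", energy_psi)
print("2 pi * integral of |phi|^2:", 2.0 * math.pi * energy_phi)
```

The two printed numbers agree to many digits, since both the Gaussian and its transform are negligible outside the truncated ranges.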

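A somewhat more general relation than (6) is easily verified, namely, if $\varphi_1(x)$, $\varphi_2(x)$ are functions in $L_2(-\infty, \infty)$ and $\psi_1(\sigma)$, $\psi_2(\sigma)$ are their Fourier transforms, then

$$\int_{-\infty}^{\infty} \psi_1(\sigma)\, \overline{\psi_2(\sigma)}\, d\sigma = 2\pi \int_{-\infty}^{\infty} \varphi_1(x)\, \overline{\varphi_2(x)}\, dx.$$

For the proof it is sufficient to apply Plancherel's theorem to the function $\varphi_1(x) + \varphi_2(x)$ and compare the results on the right- and left-hand sides.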

2. The relationships between smoothness and rate of decrease of the Fourier transform of a function, obtained in Section 3 for integrable functions, still hold for square-summable functions.

Suppose first that the square-summable function $\varphi(x)$ is locally absolutely continuous and its derivative $\varphi'(x)$ is also square-summable. Then the Fourier transform of $\varphi'(x)$ is the product of the Fourier transform $\psi(\sigma)$ of $\varphi(x)$ and $i\sigma$.

For $\varphi(x)$ now has a limit as $x \to \infty$, since

$$|\varphi(x)|^2 = \varphi(x)\, \overline{\varphi(x)} = \varphi(0)\, \overline{\varphi(0)} + \int_0^x \varphi(\xi)\, \overline{\varphi'(\xi)}\, d\xi + \int_0^x \varphi'(\xi)\, \overline{\varphi(\xi)}\, d\xi$$

and $\varphi \overline{\varphi'}$, $\overline{\varphi} \varphi'$ are integrable over an infinite interval; and obviously, the limit can only be zero.

We use this fact to construct a sequence of finite absolutely continuous functions $\varphi_\nu(x)$, so that

$$\varphi_\nu(x) \to \varphi(x), \qquad \varphi_\nu'(x) \to \varphi'(x)$$

in the space $L_2(-\infty, \infty)$. We actually take as $\varphi_\nu(x)$ a continuous function equal to $\varphi(x)$ on the interval $(-\nu, \nu)$, 0 outside the interval $U_\nu = (-\nu - |\varphi(-\nu)|,\ \nu + |\varphi(\nu)|)$ and linear in the two remaining intervals; since $\varphi(\nu) \to 0$ as $\nu \to \pm\infty$, $\varphi_\nu(x)$ tends to $\varphi(x)$ in the metric of $L_2(-\infty, \infty)$. Further, $\varphi_\nu'(x)$ coincides with $\varphi'(x)$ in the interval $(-\nu, \nu)$, is equal to 0 outside $U_\nu$ and is equal to $\pm 1$ in the two remaining intervals; obviously, $\varphi_\nu'(x)$ tends to $\varphi'(x)$ in the metric of $L_2(-\infty, \infty)$.

By Plancherel's theorem,

$$F[\varphi_\nu(x)] = \psi_\nu(\sigma) \to F[\varphi(x)] = \psi(\sigma),$$
$$F[\varphi_\nu'(x)] = i\sigma\, \psi_\nu(\sigma) \to F[\varphi'(x)] = \psi_1(\sigma),$$

in the space $L_2(-\infty, \infty)$, and since the sequence $i\sigma\, \psi_\nu(\sigma)$ evidently tends to the function $i\sigma\, \psi(\sigma)$, we get

$$\psi_1(\sigma) = F[\varphi'(x)] = i\sigma\, \psi(\sigma),$$

as required.

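Suppose conversely that $x\varphi(x)$ is square-summable along with $\varphi(x)$. We show that the Fourier transform $\psi(\sigma)$ of $\varphi(x)$ is locally absolutely continuous and $F[x\varphi(x)] = i\psi'(\sigma)$. We put $\varphi_\nu(x) = \varphi(x)$ for $|x| < \nu$ and $\varphi_\nu(x) = 0$ for $|x| \ge \nu$; then $\varphi_\nu(x) \to \varphi(x)$ in the metric of $L_2(-\infty, \infty)$. Obviously, $x\varphi_\nu(x) \to x\varphi(x)$ also in the metric of $L_2(-\infty, \infty)$. Let $F[x\varphi(x)] = i g(\sigma)$; we get

$$\psi_\nu(\sigma) = F[\varphi_\nu(x)] \to F[\varphi(x)] = \psi(\sigma),$$
$$i\psi_\nu'(\sigma) = F[x\varphi_\nu(x)] \to F[x\varphi(x)] = i g(\sigma).$$

By the lemma of art. 6, Section 3, Chapter IV a subsequence $\psi_{\nu_n}(\sigma)$, convergent almost everywhere to $\psi(\sigma)$, can be extracted from the sequence $\psi_\nu(\sigma)$. Let e.g. $\psi(\sigma_0) = \lim \psi_{\nu_n}(\sigma_0)$. We have further

$$\int_{\sigma_0}^{\sigma} [\psi_{\nu_n}'(\xi) - \psi_{\nu_m}'(\xi)]\, d\xi = [\psi_{\nu_n}(\sigma) - \psi_{\nu_m}(\sigma)] - [\psi_{\nu_n}(\sigma_0) - \psi_{\nu_m}(\sigma_0)];$$

on the other hand,

$$\left| \int_{\sigma_0}^{\sigma} [\psi_{\nu_n}'(\xi) - \psi_{\nu_m}'(\xi)]\, d\xi \right| \le \sqrt{|\sigma - \sigma_0|}\, \sqrt{\int_{\sigma_0}^{\sigma} |\psi_{\nu_n}'(\xi) - \psi_{\nu_m}'(\xi)|^2\, d\xi} \to 0,$$

and it follows that the sequence $\psi_{\nu_n}(\sigma)$ is uniformly convergent in any finite interval. But now $\psi(\sigma) = \lim \psi_{\nu_n}(\sigma)$ is a continuous function. Also,

$$\psi(\sigma) - \psi(\sigma_0) = \lim_{n \to \infty} \int_{\sigma_0}^{\sigma} \psi_{\nu_n}'(\xi)\, d\xi = \int_{\sigma_0}^{\sigma} g(\xi)\, d\xi,$$

and it follows that $\psi(\sigma)$ is locally absolutely continuous and $\psi'(\sigma) = g(\sigma)$, as we asserted.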


3. A Theorem of Wiener and Paley 


If a square-summable function $\varphi(x)$ vanishes outside the interval $[-b, b]$, its Fourier transform $\psi(\sigma)$, in addition to being also square-summable, can be continued analytically in the plane $s = \sigma + i\tau$. For the expression

$$\psi(s) = \int_{-b}^{b} \varphi(x)\, e^{-isx}\, dx$$

is defined for all complex $s = \sigma + i\tau$. It represents an analytic function of $s$ and satisfies the bound

$$|\psi(s)| \le \int_{-b}^{b} |\varphi(x)|\, e^{|\tau| |x|}\, dx \le C e^{b|\tau|} \le C e^{b|s|}.$$

An entire analytic function $\psi(s)$ which satisfies an inequality of the form

$$|\psi(s)| \le C e^{b|s|}$$

is said to be a function of exponential type $\le b$. We see that the Fourier transform of a square-summable function which vanishes for $|x| > b$ is an entire function of exponential type $\le b$. We shall devote this article to establishing the converse.
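The growth bound just derived can be observed on the simplest example. The sketch below assumes $\varphi(x) = 1$ on $[-b, b]$ (a choice made here for illustration), whose transform is $\psi(s) = 2 \sin(bs)/s$, compares this closed form with a direct quadrature of the defining integral at one complex point, and checks $|\psi(s)| \le 2b\, e^{b|\tau|}$ on a grid of complex $s$. The value $b = 1.5$, the grid and the function names are hypothetical.

```python
import cmath
import math

def psi(s, b):
    """Fourier transform of the indicator function of [-b, b]: 2 sin(b s) / s."""
    if s == 0:
        return complex(2.0 * b)
    return 2.0 * cmath.sin(b * s) / s

def psi_quadrature(s, b, steps=4000):
    """Trapezoidal value of integral_{-b}^{b} e^{-i s x} dx, straight from the definition."""
    h = 2.0 * b / steps
    total = 0.5 * (cmath.exp(1j * s * b) + cmath.exp(-1j * s * b))
    for j in range(1, steps):
        x = -b + j * h
        total += cmath.exp(-1j * s * x)
    return total * h

b = 1.5
s_test = complex(2.0, -1.0)
quadrature_error = abs(psi(s_test, b) - psi_quadrature(s_test, b))

# Largest ratio |psi(s)| / (2 b e^{b |tau|}) over a grid in the square |sigma|, |tau| <= 10
worst_ratio = 0.0
for i in range(-40, 41):
    for j in range(-40, 41):
        s = complex(0.25 * i, 0.25 * j)
        bound = 2.0 * b * math.exp(b * abs(s.imag))
        worst_ratio = max(worst_ratio, abs(psi(s, b)) / bound)

print("quadrature error at s_test:", quadrature_error)
print("largest ratio |psi| / bound:", worst_ratio)
```

The ratio never exceeds 1 and reaches it at $s = 0$, so here the constant $C$ in the bound can be taken as $\int_{-b}^{b} |\varphi(x)|\, dx = 2b$.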

THEOREM (Wiener-Paley). If an entire function $\psi(s)$ of exponential type $\le b$ is square-summable over the real axis, it is the Fourier transform of a square-summable function $\varphi(x)$ which vanishes outside the interval $[-b, b]$.

Before proving the theorem we shall obtain bounds for the 
coefficients of the Taylor development of entire functions of 
exponential type. 




It is well known that the coefficients of the Taylor development
of an analytic function

$$\psi(s) = \sum_{n=0}^{\infty} a_n s^n$$

are given by Cauchy's formula:

$$a_n = \frac{1}{2\pi i} \oint_{|s|=r} \frac{\psi(s)}{s^{n+1}}\,ds \qquad (n = 0, 1, 2, \dots).$$

If $\psi(s)$ is an entire function of exponential type $\le b$, we obtain
bounds for the numbers $a_n$ in the form

$$|a_n| \le \frac{C e^{br}}{r^n} \qquad (n = 0, 1, 2, \dots).$$

Taking the minimum over $r$ (it is attained at $r = n/b$), we get the inequalities

$$|a_n| \le C \left(\frac{eb}{n}\right)^n \qquad (n = 0, 1, 2, \dots). \qquad (*)$$


We show, starting from the inequality (*), that the partial
sums of the Taylor series of $\psi(s)$ have a common majorant of the
form $C_\varepsilon e^{(b+\varepsilon)|s|}$ for any $\varepsilon > 0$. For

$$\frac{eb|s|}{n} < \frac{1}{2}$$

as from a certain $N$, which can be taken between the numbers
$2eb|s|$ and $2eb|s| + 1$, and consequently

$$\sum_{n=N}^{\infty} |a_n s^n| \le C \sum_{n=N}^{\infty} \left(\frac{eb|s|}{n}\right)^n \le C \sum_{n=N}^{\infty} \frac{1}{2^n} \le C.$$

On the other hand, $(eb|s|/n)^n$, regarded as a function of $n$, attains
at $n = b|s|$ the maximum value $e^{b|s|}$, as is easily verified by
differentiation. Hence

$$\sum_{n=0}^{\infty} |a_n s^n| = \sum_{n=0}^{N-1} |a_n s^n| + \sum_{n=N}^{\infty} |a_n s^n| \le C(2eb|s| + 1)\, e^{b|s|} + C \le C_\varepsilon e^{(b+\varepsilon)|s|}$$

for any $\varepsilon > 0$, as asserted.
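The maximisation step used above can be checked numerically. The sketch below evaluates $(e\,b\,r/n)^n$ with $r = |s|$ and confirms that it never exceeds $e^{br}$, with equality at $n = br$. The function names and the sample values of $b$ and $r$ are illustrative assumptions.

```python
import math


def term_bound(n: float, b: float, r: float) -> float:
    """The quantity (e*b*r/n)**n, which bounds |a_n| * r**n up to the constant C."""
    return (math.e * b * r / n) ** n


def max_over_integers(b: float, r: float, n_max: int = 200) -> float:
    """Largest value of term_bound over n = 1, ..., n_max."""
    return max(term_bound(n, b, r) for n in range(1, n_max + 1))
```

For $b = 1.5$, $r = 4$ the maximum is reached at the integer $n = br = 6$; when $br$ is not an integer the largest integer term lies strictly below $e^{br}$.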


We now turn to the proof of the theorem itself. Let $\psi(s) = \sum_{n=0}^{\infty} a_n s^n$
be an entire function of exponential type $\le b$, square-summable




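over the real axis, and let $\varphi(x)$ be its (inverse) Fourier transform; by
Plancherel's theorem, we have for any function $u(x) \in L_2(-\infty, \infty)$
and its Fourier transform $v(\sigma) \in L_2(-\infty, \infty)$

$$\int_{-\infty}^{\infty} \psi(\sigma)\,\overline{v(\sigma)}\,d\sigma = 2\pi \int_{-\infty}^{\infty} \varphi(x)\,\overline{u(x)}\,dx.$$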


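Let us suppose that $v(\sigma)$ vanishes outside the interval $[-c, c]$ with
the result that $u(x)$ is an entire analytic function of the argument
$z = x + iy$. Then the series

$$\psi(\sigma)\,\overline{v(\sigma)} = \sum_{n=0}^{\infty} a_n \sigma^n\,\overline{v(\sigma)} \qquad (1)$$

converges in the metric of the space $L_2(-\infty, \infty)$. Hence we obtain
the integral

$$\int_{-\infty}^{\infty} \psi(\sigma)\,\overline{v(\sigma)}\,d\sigma$$

by term-by-term integration of (1); we get

$$\int_{-\infty}^{\infty} \psi(\sigma)\,\overline{v(\sigma)}\,d\sigma = \sum_{n=0}^{\infty} a_n \int_{-\infty}^{\infty} \sigma^n\,\overline{v(\sigma)}\,d\sigma = 2\pi \sum_{n=0}^{\infty} a_n i^n\,\overline{u^{(n)}(0)}. \qquad (2)$$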


Equation (2) also holds under more general hypotheses, when
$v(\sigma)$ is not necessarily a finite function but merely decreases with
sufficient rapidity for $\sum_{n=0}^{\infty} a_n \sigma^n v(\sigma)$ to remain convergent in the
metric of the space $L_2(-\infty, \infty)$. Since, by what has been proved,
the partial sums of the Taylor series of the function $\psi(\sigma)$ share a
common majorant of the form $C_\varepsilon e^{(b+\varepsilon)|\sigma|}$ for any $\varepsilon > 0$, it is enough
for the function $v(\sigma)$ to decrease as $|\sigma| \to \infty$ more rapidly than
$e^{-c|\sigma|}$, $c > b$.

We denote by $E_c$ the class of functions $v(\sigma)$ which satisfy the
inequality

$$|v(\sigma)| \le C e^{-c|\sigma|} \qquad (c > 0).$$

We shall show that if the function $v(\sigma)$ belongs to $E_c$, then each
of the equations

$$w'(\sigma) \pm c_1 w(\sigma) = v(\sigma) \qquad (0 < c_1 < c)$$




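has a solution in the class $E_{c_1}$. For, taking the "+" sign, we can
take as solution

$$w(\sigma) = e^{-c_1 \sigma} \int_{-\infty}^{\sigma} e^{c_1 \lambda}\, v(\lambda)\,d\lambda.$$

It is clear that the integral on the right-hand side exists for all $\sigma$.
Further, as $\sigma \to +\infty$ we have

$$|w(\sigma)| \le e^{-c_1 \sigma} \int_{-\infty}^{\infty} e^{c_1 \lambda}\,|v(\lambda)|\,d\lambda \le C_1 e^{-c_1 \sigma},$$

and as $\sigma \to -\infty$

$$|w(\sigma)| \le e^{-c_1 \sigma} \int_{-\infty}^{\sigma} C e^{c_1 \lambda} e^{c \lambda}\,d\lambda = \frac{C e^{c\sigma}}{c + c_1} \le C_2 e^{-c|\sigma|} \le C_2 e^{-c_1|\sigma|},$$

so that $w(\sigma) \in E_{c_1}$. Similarly for the "−" sign, we take as solution

$$w(\sigma) = -e^{c_1 \sigma} \int_{\sigma}^{\infty} e^{-c_1 \lambda}\, v(\lambda)\,d\lambda.$$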
e 


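Now let us consider, independently of the foregoing constructions,
the series

$$\sum_{n=0}^{\infty} a_n i^n u^{(n)}(0), \qquad (3)$$

where the numbers $a_n$ satisfy the inequalities $|a_n| \le C (e b n^{-1})^n$
and $u(x + iy)$ is an arbitrary function, analytic in the disc $|z| \le b$.
We show that the series (3) is convergent. Every function $u(z)$
that is analytic for $|z| \le b$ is defined and analytic on some disc
$|z| \le b + \varepsilon$, and by Cauchy's formula

$$u^{(n)}(0) = \frac{n!}{2\pi i} \oint_{|z| = b + \varepsilon} \frac{u(z)}{z^{n+1}}\,dz,$$

hence

$$|u^{(n)}(0)| \le \frac{C\, n!}{(b + \varepsilon)^n}. \qquad (4)$$

Substituting $C_1 n^{n+1/2} e^{-n}$ for $n!$, in accordance with Stirling's
formula, we get

$$|a_n u^{(n)}(0)| \le C_2 \sqrt{n} \left(\frac{b}{b + \varepsilon}\right)^n \qquad (5)$$

and therefore

$$\sum_{n=0}^{\infty} |a_n i^n u^{(n)}(0)| \le C_2 \sum_{n=0}^{\infty} \sqrt{n} \left(\frac{b}{b + \varepsilon}\right)^n < \infty, \qquad (6)$$

as required.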
Consider now the functional

$$\Phi(u) = 2\pi \int_{-\infty}^{\infty} \varphi(x)\,\overline{u(x)}\,dx = \int_{-\infty}^{\infty} \psi(\sigma)\,\overline{v(\sigma)}\,d\sigma,$$

which is defined over functions $u(x) \in L_2(-\infty, \infty)$; its value over
functions $u(x)$ whose Fourier transforms $v(\sigma)$ belong to the class $E_c$, $c > b$,
is given by formula (2):

$$\Phi(u) = 2\pi \sum_{n=0}^{\infty} a_n i^n\,\overline{u^{(n)}(0)}.$$

As we have shown, this formula can be taken to define it over
the totality of functions that are analytic in the disc $|z| \le b$. In
addition, the functional $\Phi(u)$ is continuous in the following sense:
if the functions $u_m(z)$ are defined and analytic on a disc $|z| \le b + \varepsilon$
and converge uniformly to zero as $m \to \infty$ over the whole disc,
then $\Phi(u_m) \to 0$. This follows from equations (4)-(6), where we
can evidently substitute the quantity $\max_{|z| \le b + \varepsilon} |u(z)|$ for the
constant $C$.


We must show that the function $\varphi(x)$, the (inverse) Fourier transform
of the given function $\psi(\sigma)$, vanishes for $|x| \ge b$. Let
$\varepsilon/2 < \varepsilon_1 < \varepsilon_2 < \dots < \varepsilon_m < \dots \to \varepsilon$ and put
m Zz 2 
Flea) 

Um (2) om z 2 
1 
ZZ 1 | r (5 + a) | 
where $u_0(z) \in L_2(-\infty, \infty)$ is some entire function whose Fourier
transform $v_0(\sigma)$ belongs to the class $E_{b+\varepsilon}$.

Resolving the coefficient of $u_0(z)$ into partial fractions, we obtain
a representation of $u_m(z)$ in the form
m A m . 
Um (2) = Ag Uo(2) + 3’ ——*— tg(z) + 3) ———tH(2). 


k=l ve ke 


Up (2), 




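Let

$$u_k^{+}(z) = \left(1 + \frac{iz}{b + \varepsilon_k}\right)^{-1} u_0(z),$$

$$u_k^{-}(z) = \left(1 - \frac{iz}{b + \varepsilon_k}\right)^{-1} u_0(z),$$

and let $v_k^{+}(\sigma)$, $v_k^{-}(\sigma)$ be the corresponding Fourier transforms; by
formula (3) of Section 3, art. 4, we have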


ee d 1 
U(a) = UK (c) f do b ae Ex UE (c), 
ad 1 _ 
U(o) = % (0) — ao baa, ho 


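Since the function $v_0(\sigma)$ belongs, by hypothesis, to the class $E_{b+\varepsilon}$,
and the numbers $b + \varepsilon_k$ are less than $b + \varepsilon$, by what we have
proved $v_k^{+}$, $v_k^{-}$ belong to the class $E_{b+\varepsilon_k}$; it follows that the
function $v_m(\sigma) = F[u_m(x)]$ belongs to the class $E_{b+\varepsilon/2}$. Hence at the
function $v_m(\sigma)$ the functional (2) assumes the value

$$\Phi(u_m) = \int_{-\infty}^{\infty} \psi(\sigma)\,\overline{v_m(\sigma)}\,d\sigma = 2\pi \sum_{n=0}^{\infty} a_n i^n\,\overline{u_m^{(n)}(0)}.$$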
-0 n=0 


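As $m \to \infty$ the sequence of functions $u_m(z)$

(a) converges to 0 uniformly over the disc $|z| \le b + \varepsilon/2$,

(b) converges in the metric of $L_2(-\infty, \infty)$ to a function equal
to $u_0(x)$ for $|x| > b + \varepsilon$ and equal to 0 for $|x| < b + \varepsilon$.

In virtue of what we have proved $\Phi(u_m) \to 0$. At the same time

$$\Phi(u_m) = 2\pi \int_{-\infty}^{\infty} \varphi(x)\,\overline{u_m(x)}\,dx \to 2\pi \int_{|x| \ge b + \varepsilon} \varphi(x)\,\overline{u_0(x)}\,dx.$$

It follows that

$$\int_{|x| \ge b + \varepsilon} \varphi(x)\,\overline{u_0(x)}\,dx = 0.$$

We can write this equation in the form

$$\int_{-\infty}^{\infty} \varphi_\varepsilon(x)\,\overline{u_0(x)}\,dx = 0,$$

where $\varphi_\varepsilon(x)$ is equal to $\varphi(x)$ for $|x| \ge b + \varepsilon$ and to 0 for $|x| < b + \varepsilon$.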
The function $u_0(x)$ here is any entire function whose Fourier
transform belongs to the class $E_{b+\varepsilon}$. Since the aggregate of such
functions is complete in the space $L_2(-\infty, \infty)$†, we have $\varphi_\varepsilon(x) = 0$
(almost everywhere), and hence $\varphi(x) = 0$ almost everywhere that
$|x| \ge b + \varepsilon$; since $\varepsilon > 0$ is arbitrary, $\varphi(x)$ vanishes for $|x| > b$
almost everywhere, as required.


Problems. 1. Let $E$ denote the totality of functions $\psi(\sigma)$ which are entire
functions of the argument $s = \sigma + i\tau$ of exponential type $\le b$ and such that

$$\int_{-\infty}^{\infty} |\psi(\sigma)|^2\,d\sigma = 1. \qquad (*)$$

Let $G$ be a bounded measurable set on the $\sigma$ axis. Show that

$$\theta(G) = \sup_{\psi \in E} \int_G |\psi(\sigma)|^2\,d\sigma < 1.$$

Hint. Use the Wiener-Paley theorem to show that the functions $\psi(\sigma + i\tau) \in E$
are uniformly bounded in any circle. Show by Cauchy's formula that the
first derivatives of the functions $\psi(s)$ are bounded in any circle. By Arzelà's
theorem (Chapter II, Section 7, problem 5), the set $E$ is compact in any
circle $Q$. If there exists a sequence $\psi_n \in E$ for which $\int_G |\psi_n(\sigma)|^2\,d\sigma \to 1$, on
choosing a subsequence from it, uniformly convergent in $Q \supset G$, we obtain
the equations for the limit function $\psi(\sigma)$:

$$\int_G |\psi(\sigma)|^2\,d\sigma = 1, \qquad \int_{Q - G} |\psi(\sigma)|^2\,d\sigma = 0, \qquad (**)$$

whence $\psi(\sigma) = 0$ in $Q - G$, so that $\psi(\sigma) \equiv 0$, which contradicts (**).

2. (continued). Show that

$$\theta(G) \le 2b\,\mu G$$

for any set $G$ of finite measure.

Hint. Apply the Cauchy-Bunyakovsky inequality to the expression
for $\psi(\sigma)$ in terms of $\varphi(x)$.

3. (continued). Show that the result of problem 1 still holds for any set
$G$ of finite measure (B. P. Paneyakh).

Hint. $G = G_1 + G_2$, where $G_1$ is bounded, $2b\,\mu G_2 < 1$. If $\theta(G) = 1$, we can
find as in problem 1 a sequence $\psi_n(\sigma)$ convergent to zero uniformly outside
$G$, and hence on $G_1$, so that $\int_{G_1} |\psi_n(\sigma)|^2\,d\sigma \to 0$; but now $\theta(G_2) = 1$, which
contradicts the result of problem 2.


† For example, we can take as the functions $u_0(x)$ the Hermite functions
$x^n e^{-x^2}$ (cf. Section 3, art. 5).




4. Show that the Fourier transform $\psi(s)$ of a function $\varphi(x) \in L_2(-\infty, \infty)$,
equal to zero for $x < 0$, is characterised by the condition: $\psi(\sigma + i\tau)$ is
analytic for $\tau < 0$ and $\int_{-\infty}^{\infty} |\psi(\sigma + i\tau)|^2\,d\sigma \le C$ for all $\tau < 0$.

Hint. Use Plancherel's theorem for $\tau < 0$.


4. The Uncertainty Principle in Quantum Mechanics 


In investigating the motion of a material particle M in quantum 
mechanics the quantities sought are not the coordinates of the 
particle and its velocity, as in classical mechanics, but the proba- 
bility distributions of these quantities. For simplicity we shall 
consider the one-dimensional case. Then the functions sought are 
the following: 

(1) The position function $\varphi(x)$. This function is defined on the
whole line $-\infty < x < \infty$ and satisfies the condition

$$\int_{-\infty}^{\infty} |\varphi(x)|^2\,dx = 1; \qquad (1)$$

it determines the probability that (at a given instant) the material
particle $M$ will be situated in the interval $(\alpha, \beta)$ in accordance with
the formula

$$\mathrm{Prob}\,\{x \in (\alpha, \beta)\} = \int_{\alpha}^{\beta} |\varphi(x)|^2\,dx.$$

(2) The momentum function $\psi(p)$. This function is defined on the
line $-\infty < p < \infty$ and satisfies the condition

$$\int_{-\infty}^{\infty} |\psi(p)|^2\,dp = 2\pi. \qquad (2)$$

It determines the probability that the magnitude of the momentum
of the particle (the product of its mass and its velocity) will
lie in the interval $(\gamma, \delta)$ in accordance with the formula

$$\mathrm{Prob}\,\{p \in (\gamma, \delta)\} = \frac{1}{2\pi} \int_{\gamma}^{\delta} |\psi(p)|^2\,dp.$$


One of the fundamental axioms of quantum mechanics (we shall 
not go into its physical significance) consists in the hypothesis 




that the momentum function is the Fourier transform of the
position function:

$$\psi(p) = \int_{-\infty}^{\infty} \varphi(x)\, e^{-ipx}\,dx. \qquad (3)$$
If we know the position function, we can write down the “most
probable” value (the mathematical expectation) of the position of
the particle:

$$\xi = \int_{-\infty}^{\infty} x\,|\varphi(x)|^2\,dx.$$

It can be assumed that $\xi = 0$, since we can always effect a translation
along the $x$-axis. We observe that the quantity $|\psi(p)|$ is
invariant under such translation since for any $\varphi(x)$

$$\int_{-\infty}^{\infty} \varphi(x - h)\, e^{-ipx}\,dx = e^{-iph} \int_{-\infty}^{\infty} \varphi(x)\, e^{-ipx}\,dx. \qquad (4)$$

Similarly we can also assume that the mathematical expectation
of the momentum

$$\eta = \frac{1}{2\pi} \int_{-\infty}^{\infty} p\,|\psi(p)|^2\,dp \qquad (5)$$

is equal to zero. An estimate of the scatter of the quantity $x$ is
given by the mean square deviation (the dispersion)

$$\delta_x^2 = \int_{-\infty}^{\infty} x^2\,|\varphi(x)|^2\,dx. \qquad (6)$$

The smaller $\delta_x^2$, the greater the likelihood that the point $M$ is
actually situated close to the origin. Similarly an estimate of the scatter
of the quantity $p$ is given by the mean square deviation

$$\delta_p^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} p^2\,|\psi(p)|^2\,dp. \qquad (7)$$

It is naturally assumed that the functions $\varphi(x)$, $\psi(p)$ are such that
the integrals (6), (7) exist. Hence, by what was said in art. 2, the
functions $\varphi'(x)$, $\psi'(p)$ exist and are square-summable. The function
$x\varphi'(x)\overline{\varphi(x)}$ is integrable in the first degree, whilst the
integrable function $x\varphi\bar\varphi$, which has the integrable derivative
$\varphi\bar\varphi + x\varphi'\bar\varphi + x\varphi\bar\varphi'$, vanishes at infinity. We shall establish the
inequality

$$\delta_x^2\,\delta_p^2 \ge \tfrac{1}{4}, \qquad (8)$$

which is said to be the uncertainty relation. It shows that the more
accurately we know the position of the particle (i.e. the smaller $\delta_x$)
the less accurately we know the magnitude of its momentum (i.e.
the greater $\delta_p$), and conversely, so that the simultaneous existence
of functions $\varphi(x)$, $\psi(p)$ determining with great precision both the
position of the particle and the magnitude of its momentum is
impossible.

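For the proof of the relation (8) we consider the integral

$$I(\alpha) = \int_{-\infty}^{\infty} |\alpha x \varphi(x) + \varphi'(x)|^2\,dx,$$

where $\alpha$ is a real parameter. Using the result $|z|^2 = z\bar z$, we find

$$I(\alpha) = \int_{-\infty}^{\infty} (\alpha x \varphi + \varphi')(\alpha x \bar\varphi + \bar\varphi')\,dx = \alpha^2 \int_{-\infty}^{\infty} x^2 |\varphi|^2\,dx + \alpha \int_{-\infty}^{\infty} x(\varphi\bar\varphi' + \varphi'\bar\varphi)\,dx + \int_{-\infty}^{\infty} |\varphi'|^2\,dx.$$

In addition

$$\int_{-\infty}^{\infty} x^2 |\varphi|^2\,dx = \delta_x^2,$$

$$\int_{-\infty}^{\infty} x(\varphi\bar\varphi' + \varphi'\bar\varphi)\,dx = \int_{-\infty}^{\infty} x\,\frac{d}{dx}(\varphi\bar\varphi)\,dx = x\varphi\bar\varphi\,\Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} |\varphi|^2\,dx = -1,$$

and by Plancherel's theorem, since $F[\varphi'] = ip\,\psi(p)$,

$$\int_{-\infty}^{\infty} |\varphi'(x)|^2\,dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} p^2 |\psi(p)|^2\,dp = \delta_p^2.$$

Hence

$$I(\alpha) = \alpha^2 \delta_x^2 - \alpha + \delta_p^2. \qquad (9)$$

Since by construction $I(\alpha) \ge 0$, the quadratic trinomial (9) does
not have two distinct real zeros, and consequently

$$1 - 4\delta_x^2 \delta_p^2 \le 0,$$

which is equivalent to the required inequality

$$\delta_x^2\,\delta_p^2 \ge \tfrac{1}{4}.$$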
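The relation can be illustrated numerically. The sketch below computes $\delta_x^2\,\delta_p^2$ for a real, even function $\varphi$ (so that both expectations vanish), using the identity $\delta_p^2 = \int |\varphi'|^2\,dx$ from the proof and normalising by $\int |\varphi|^2\,dx$. The helper names, the trapezoidal quadrature and the two sample functions are assumptions made for illustration only.

```python
import math


def integrate(f, lo: float, hi: float, steps: int = 20000) -> float:
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


def dispersion_product(phi, dphi, half_width: float, steps: int = 20000) -> float:
    """Return delta_x^2 * delta_p^2 for a real even phi with derivative dphi.

    The integrals are truncated to [-half_width, half_width]; phi is
    normalised numerically so that the integral of phi^2 equals 1.
    """
    norm = integrate(lambda x: phi(x) ** 2, -half_width, half_width, steps)
    dx2 = integrate(lambda x: x * x * phi(x) ** 2, -half_width, half_width, steps) / norm
    # By Plancherel, (1/2pi) * integral of p^2 |psi|^2 equals integral of |phi'|^2.
    dp2 = integrate(lambda x: dphi(x) ** 2, -half_width, half_width, steps) / norm
    return dx2 * dp2
```

For the Gaussian $e^{-x^2/2}$ the product equals $1/4$, so the bound (8) is attained; for $e^{-x^4}$ it is strictly larger.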
7. THE FOURIER-STIELTJES TRANSFORM


The formula for the Fourier transform of an absolutely integrable
function,

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,\varphi(x)\,dx \qquad (1)$$

can be expressed in the form of a Stieltjes integral

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi(x),$$

where

$$\Phi(x) = \int_{-\infty}^{x} \varphi(\xi)\,d\xi$$

is an absolutely continuous function of bounded variation on the
line $-\infty < x < \infty$.
It is possible to consider general integrals of the form

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi(x), \qquad (2)$$

where $\Phi(x)$ is now an arbitrary function of bounded variation on
the line $-\infty < x < \infty$. An integral of the form (2) is said to be a
Fourier-Stieltjes integral.

The function $\psi(\sigma)$ defined by the integral (2) is bounded:

$$|\psi(\sigma)| = \left|\int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi(x)\right| \le \int_{-\infty}^{\infty} |d\Phi(x)| = V_{-\infty}^{\infty}[\Phi].$$

It is also continuous; for

$$\psi(\sigma') - \psi(\sigma'') = \int_{|x| \le A} [e^{-ix\sigma'} - e^{-ix\sigma''}]\,d\Phi(x) + \int_{|x| > A} [e^{-ix\sigma'} - e^{-ix\sigma''}]\,d\Phi(x);$$

the second integral becomes arbitrarily small for sufficiently large
$A$ independently of $\sigma'$ and $\sigma''$, while the first, for any chosen $A$,
becomes arbitrarily small for a sufficiently small difference
$|\sigma' - \sigma''|$.

But in contrast with the Fourier-Lebesgue integral (1), the
Fourier-Stieltjes integral (2) does not, generally speaking, tend
to zero as $|\sigma| \to \infty$. For example, if $\Phi(x)$ corresponds to a unit
mass concentrated at the point $x_0$, then

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi(x) = e^{-i\sigma x_0}$$

is a periodic function of $\sigma$.
Every periodic function $\psi(\sigma)$ which has a Fourier series development

$$\psi(\sigma) = \sum_{n=-\infty}^{\infty} a_n e^{-in\sigma} \qquad (3)$$

for which the series of coefficients is absolutely convergent,

$$\sum_{n=-\infty}^{\infty} |a_n| < \infty,$$

can be expressed in the form of a Fourier-Stieltjes integral: for
this a generating function must be chosen which is piecewise-constant
and has a saltus $a_n$ at each point $x = n$. The same is true
of a more general class of functions, obtained from (3) by replacing
the exponent $in\sigma$ by $i\lambda_n\sigma$, where $\lambda_n$ is an arbitrary sequence of
real numbers; these functions belong to the class of so-called
almost periodic functions.

We shall demonstrate how the Fourier—Stieltjes integral can be 
applied in proving a theorem which has applications in probability 
theory. 

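We shall call a measurable function $\psi(\sigma)$ ($-\infty < \sigma < \infty$)
positive-definite if for any continuous function $u(\sigma)$ and any
$a$, $b$

$$\int_a^b \int_a^b \psi(\sigma - \eta)\,u(\sigma)\,\overline{u(\eta)}\,d\sigma\,d\eta \ge 0. \qquad (4)$$

An example of a positive-definite function is the function $e^{-i\sigma x}$
with $x$ fixed; for

$$\int_a^b \int_a^b e^{-i(\sigma - \eta)x}\,u(\sigma)\,\overline{u(\eta)}\,d\sigma\,d\eta = \int_a^b e^{-i\sigma x}\,u(\sigma)\,d\sigma \int_a^b e^{i\eta x}\,\overline{u(\eta)}\,d\eta = \left|\int_a^b e^{-i\sigma x}\,u(\sigma)\,d\sigma\right|^2 \ge 0.$$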
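A discrete analogue of condition (4) is easy to experiment with: replace the double integral by a double sum over sample points $\sigma_j$ with weights $u_j$. The sketch below evaluates that sum for a Fourier-Stieltjes transform of a step function with non-negative jumps (always non-negative) and for the indicator of $[-1, 1]$ (which is not positive-definite). The function names, the sample data and the discretisation itself are illustrative assumptions, not constructions taken from the text.

```python
import cmath
import math
import random


def quadratic_form(psi, points, weights) -> complex:
    """Discrete analogue of (4): sum over j, k of psi(s_j - s_k) * u_j * conj(u_k)."""
    total = 0j
    for sj, uj in zip(points, weights):
        for sk, uk in zip(points, weights):
            total += psi(sj - sk) * uj * uk.conjugate()
    return total


def stieltjes_sum(jumps):
    """Transform of a step function with jumps m_k >= 0 at the points x_k.

    `jumps` is a list of (m_k, x_k) pairs; the result is
    sigma -> sum of m_k * exp(-i * sigma * x_k).
    """
    def psi(sigma: float) -> complex:
        return sum(m * cmath.exp(-1j * sigma * x) for m, x in jumps)
    return psi
```

For the step-function transform the sum equals $\sum_k m_k \left|\sum_j u_j e^{-i\sigma_j x_k}\right|^2 \ge 0$, whereas for the indicator of $[-1, 1]$ the points $0, 1, 2$ with weights $1, -\sqrt{2}, 1$ give the negative value $4 - 4\sqrt{2}$.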




It is found that any continuous positive-definite function $\psi(\sigma)$
can be expressed as a “Stieltjes combination” of the functions $e^{-i\sigma x}$;
we have the following theorem:

THEOREM (S. Bochner, A. Ya. Khinchin, 1932). Every continuous
positive-definite function $\psi(\sigma)$ can be expressed in the form

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi(x),$$

where $\Phi(x)$ is a bounded non-decreasing function.

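Proof. We consider first the case when the positive-definite
function $\psi(\sigma)$ has, in addition, a continuous derivative. Putting
$u(\sigma) = e^{i\sigma\xi}$, $a = 0$, $b = n$ in (4), we get

$$\int_0^n \int_0^n \psi(\sigma - \eta)\, e^{i\sigma\xi} e^{-i\eta\xi}\,d\sigma\,d\eta \ge 0$$

or, replacing $\sigma - \eta$ by $\lambda$ and dividing by $n$,

$$\int_{-n}^{n} \psi(\lambda)\, e^{i\lambda\xi} \left(1 - \frac{|\lambda|}{n}\right) d\lambda = f_n(\xi) \ge 0. \qquad (5)$$

We can interpret (5) as the Fourier transform of the function

$$\Psi_n(\lambda) = \begin{cases} \psi(\lambda)\left(1 - \dfrac{|\lambda|}{n}\right) & (|\lambda| \le n), \\ 0 & (|\lambda| > n). \end{cases}$$

Since, by hypothesis, the function $\psi(\sigma)$ is differentiable, the
function $\Psi_n(\lambda)$ is piecewise-smooth and by a theorem of Section 2
(cf. p. 382) the inversion formula

$$\Psi_n(\lambda) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\lambda\xi} f_n(\xi)\,d\xi \qquad (6)$$

holds for all $\lambda$.
In addition

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} f_n(\xi)\,d\xi = \Psi_n(0) = \psi(0).$$

Introducing the monotone function

$$\Phi_n(x) = \frac{1}{2\pi} \int_{-\infty}^{x} f_n(\xi)\,d\xi,$$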




we can write the integral (6) in the form of a Stieltjes integral

$$\Psi_n(\lambda) = \int_{-\infty}^{\infty} e^{-i\lambda x}\,d\Phi_n(x). \qquad (7)$$

As $n \to \infty$ the left-hand side obviously tends to the limit $\psi(\lambda)$.
By Helly's theorem (Chapter VI, Section 6, art. 4) the functions
$\Phi_n(x)$ form a sequence which contains an everywhere convergent
subsequence; with a suitable renumbering, we can assume that
the sequence $\Phi_n(x)$ itself converges everywhere to some function
$\Phi(x)$, which is also non-decreasing and varies within the same
limits 0 and $\psi(0)$. If we were now to apply the theorem of Helly
(Chapter VI, Section 6, art. 3) which ensures the legality of a
passage to the limit under the Stieltjes integral sign, we should
get that

$$\psi(\lambda) = \lim_{n \to \infty} \int_{-\infty}^{\infty} e^{-i\lambda x}\,d\Phi_n(x) = \int_{-\infty}^{\infty} e^{-i\lambda x}\,d\Phi(x),$$

and the proof would be complete.

When using Helly's theorem, we have to note the fact that the
interval of integration is infinite, and the function $e^{-i\lambda x}$ is not
continuous at infinity. In accordance with note 3 after Helly's
theorem, we must therefore verify that condition (*) is fulfilled:
given any $\varepsilon > 0$, an $N = N(\varepsilon)$ can be found such that

$$\mathop{\mathrm{Var}}_{|x| \ge N} \Phi_n(x) \le \varepsilon$$

for all $n$.
For this, we apply the following lemma: 
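Lemma. Given a family of functions $\Psi_\alpha(\lambda)$ of the form

$$\Psi_\alpha(\lambda) = \int_{-\infty}^{\infty} e^{-i\lambda x}\,d\Phi_\alpha(x),$$

where $\Phi_\alpha(x)$ are non-decreasing functions of bounded variation, if
the family is equicontinuous for $\lambda = 0$, i.e. if, given any $\varepsilon > 0$, a
$\delta > 0$ can be found such that

$$|\Psi_\alpha(\lambda) - \Psi_\alpha(0)| < \varepsilon$$

for all $\alpha$ and $|\lambda| < \delta$, the functions $\Phi_\alpha(x)$ satisfy condition (*) (with
$n$ replaced by $\alpha$).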




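Proof of the lemma. It follows in particular from the equicontinuity
condition that, for $|h| < \delta$,

$$\left|\Psi_\alpha(0) - \frac{1}{2h} \int_{-h}^{h} \Psi_\alpha(\lambda)\,d\lambda\right| = \frac{1}{2h} \left|\int_{-h}^{h} [\Psi_\alpha(0) - \Psi_\alpha(\lambda)]\,d\lambda\right| \le \frac{1}{2h} \int_{-h}^{h} |\Psi_\alpha(0) - \Psi_\alpha(\lambda)|\,d\lambda < \varepsilon.$$

Further, having found $h$, we can find $A$ such that $\left|\dfrac{\sin hx}{hx}\right| < \dfrac{1}{2}$
for $|x| > A$ (it is sufficient to put $A = 2/h$, since $|\sin hx| \le 1$); we then get

$$\varepsilon > \Psi_\alpha(0) - \frac{1}{2h} \int_{-h}^{h} \Psi_\alpha(\lambda)\,d\lambda = \int_{-\infty}^{\infty} d\Phi_\alpha(x) - \int_{-\infty}^{\infty} \left\{\frac{1}{2h} \int_{-h}^{h} e^{-i\lambda x}\,d\lambda\right\} d\Phi_\alpha(x)$$

$$= \int_{-\infty}^{\infty} \left[1 - \frac{\sin hx}{hx}\right] d\Phi_\alpha(x) \ge \int_{|x| \ge A} \left[1 - \frac{\sin hx}{hx}\right] d\Phi_\alpha(x) \ge \frac{1}{2} \int_{|x| \ge A} d\Phi_\alpha(x),$$

and condition (*) is fulfilled, as required.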
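The two elementary facts used in this proof can be checked numerically: the average of $e^{-i\lambda x}$ over $-h \le \lambda \le h$ equals $\sin(hx)/(hx)$, and $|\sin t / t| < 1/2$ as soon as $t > 2$ (while at $t = 1$ it is still about $0.84$). The sketch below uses a midpoint rule; the function names and sample values are illustrative assumptions.

```python
import cmath
import math


def averaged_exponential(x: float, h: float, steps: int = 2000) -> complex:
    """(1/2h) * integral of exp(-i*lam*x) over -h <= lam <= h, by the midpoint rule."""
    width = 2.0 * h / steps
    total = 0j
    for k in range(steps):
        lam = -h + (k + 0.5) * width
        total += cmath.exp(-1j * lam * x)
    return total * width / (2.0 * h)


def sinc(t: float) -> float:
    """sin(t)/t with the limiting value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(t) / t
```

The bound $|\sin t / t| \le 1/t$ is what makes the choice $|x| > 2/h$ sufficient for the factor $1 - \sin(hx)/(hx)$ to stay above $1/2$.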

We return to the proof of the theorem. We see that it remains
to verify the equicontinuity at $\lambda = 0$ of the set of functions $\Psi_n(\lambda)$
figuring in equation (7). But each of these functions is obtained
from a fixed continuous function $\psi(\lambda)$ by multiplying by $1 - (|\lambda|/n)$,
which tends uniformly to 1 on any bounded interval as $n \to \infty$. The
equicontinuity condition is clearly satisfied in this case. We have thus proved the
Bochner-Khinchin theorem, at any rate for the case when $\psi(\sigma)$
is differentiable.

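Now let the positive-definite function $\psi(\sigma)$ be merely continuous.
We form the symmetric mean of $\psi(\sigma)$:

$$\psi^h(\sigma) = S_h\psi(\sigma) = \frac{1}{2h} \int_{-h}^{h} \psi(\sigma + \xi)\,d\xi = \frac{1}{2h} \int_{-h}^{h} \psi(\sigma - \xi)\,d\xi = \frac{1}{2h} \int_{\sigma - h}^{\sigma + h} \psi(\eta)\,d\eta.$$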




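The function $\psi^h(\sigma)$ evidently has a continuous derivative. We
show that the double symmetric mean of the function $\psi(\sigma)$

$$S_h S_h \psi(\sigma) = \frac{1}{4h^2} \int_{-h}^{h} \int_{-h}^{h} \psi(\sigma + \xi + \eta)\,d\xi\,d\eta \qquad (8)$$

is again a positive-definite function. For

$$\iint S_h S_h \psi(\sigma - \tau)\,u(\sigma)\,\overline{u(\tau)}\,d\sigma\,d\tau = \frac{1}{4h^2} \iiiint \psi(\sigma - \tau + \xi + \eta)\,u(\sigma)\,\overline{u(\tau)}\,d\sigma\,d\tau\,d\xi\,d\eta$$

$$= \frac{1}{4h^2} \iiiint \psi(\sigma' - \tau')\,u(\sigma' - \xi)\,\overline{u(\tau' + \eta)}\,d\sigma'\,d\tau'\,d\xi\,d\eta$$

$$= \iint \psi(\sigma' - \tau') \left[\frac{1}{2h} \int_{-h}^{h} u(\sigma' - \xi)\,d\xi\right] \left[\frac{1}{2h} \int_{-h}^{h} \overline{u(\tau' + \eta)}\,d\eta\right] d\sigma'\,d\tau'$$

$$= \iint \psi(\sigma' - \tau')\,v(\sigma')\,\overline{v(\tau')}\,d\sigma'\,d\tau' \ge 0,$$

where

$$v(\sigma) = \frac{1}{2h} \int_{-h}^{h} u(\sigma - \xi)\,d\xi = \frac{1}{2h} \int_{-h}^{h} u(\sigma + \eta)\,d\eta.$$

Applying the theorem in the form already proved to the double
symmetric mean $\psi^{hh}(\sigma) = S_h S_h \psi(\sigma)$ of the function $\psi(\sigma)$, we get

$$\psi^{hh}(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi_h(x), \qquad (9)$$

where $\Phi_h(x)$ is a non-decreasing function which varies between 0
and $\psi^{hh}(0)$. The family of functions $\psi^{hh}(\sigma)$ is equicontinuous
at $\sigma = 0$, for we have

$$|\psi^{hh}(\sigma) - \psi^{hh}(0)| \le \frac{1}{4h^2} \int_{-h}^{h} \int_{-h}^{h} |\psi(\sigma + \xi + \eta) - \psi(\xi + \eta)|\,d\xi\,d\eta \le \max_{|\xi| \le h,\ |\eta| \le h} |\psi(\sigma + \xi + \eta) - \psi(\xi + \eta)|,$$

and the question reduces to the uniform continuity of the function
$\psi(\sigma)$ itself in a neighbourhood of the point zero.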




We now let $h$ tend to 0. The functions $\psi^{hh}(\sigma)$ tend to $\psi(\sigma)$. The
sequence $\Phi_h(x)$ contains a subsequence, which we can take to be
the sequence $\Phi_h(x)$ itself, which converges everywhere to a
non-decreasing function $\Phi(x)$ whose values range between 0 and $\psi(0)$.
Using again the lemma proved above, we can once more apply
Helly's theorem to the interval $-\infty < x < \infty$ and thus obtain
from (9) the equation

$$\psi(\sigma) = \int_{-\infty}^{\infty} e^{-i\sigma x}\,d\Phi(x). \qquad (10)$$

This completes the proof of the Bochner-Khinchin theorem.

Note. A more searching analysis reveals that the requirement
that the positive-definite function of the Bochner-Khinchin
theorem be continuous is superfluous. The theorem holds under
the supposition that the positive-definite function $\psi(\sigma)$ is measurable;
the representation (10) is then found to hold almost everywhere.
It follows as a corollary that every measurable positive-definite
function can be made continuous by altering its values on
a set of measure zero.


8. THE FOURIER TRANSFORM IN THE CASE OF SEVERAL INDEPENDENT VARIABLES


In problems of mathematical physics we have to deal with the
Fourier transform of functions of several variables. In this paragraph
we consider the simplest properties of this transform.

Let $\varphi(x) = \varphi(x_1, \dots, x_n)$ be an integrable function of $n$ variables
$x_1, \dots, x_n$, defined on the whole of the $n$-dimensional space $R_n$.
The Fourier transform of the function $\varphi(x)$ is defined as the function

$$\psi(\sigma) = \psi(\sigma_1, \dots, \sigma_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i(\sigma_1 x_1 + \dots + \sigma_n x_n)}\,\varphi(x_1, \dots, x_n)\,dx_1 \cdots dx_n \qquad (1)$$

or in symbolic notation

$$\psi(\sigma) = \int_{R_n} e^{-i(\sigma, x)}\,\varphi(x)\,dx.$$

If $\varphi(x)$ is the product of functions $\varphi_1(x_1), \dots, \varphi_n(x_n)$, each of
which is integrable with respect to its own variable, then the
$n$-tuple integral (1) reduces to the product of the $n$ simple integrals:

$$\psi(\sigma) = \int_{-\infty}^{\infty} \varphi_1(x_1)\, e^{-i\sigma_1 x_1}\,dx_1 \cdots \int_{-\infty}^{\infty} \varphi_n(x_n)\, e^{-i\sigma_n x_n}\,dx_n = \psi_1(\sigma_1) \cdots \psi_n(\sigma_n),$$

where $\psi_j(\sigma_j)$ is the usual Fourier transform of the function $\varphi_j(x_j)$.

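In the general case the multiple integral (1) can be expressed,
by Fubini's theorem, in the form of a repeated integral

$$\psi(\sigma) = \int_{-\infty}^{\infty} \left\{ \cdots \int_{-\infty}^{\infty} \left\{ \int_{-\infty}^{\infty} \varphi(x_1, \dots, x_n)\, e^{-i\sigma_1 x_1}\,dx_1 \right\} e^{-i\sigma_2 x_2}\,dx_2 \cdots \right\} e^{-i\sigma_n x_n}\,dx_n.$$

Each of the curly brackets determines the Fourier transform with
respect to one coordinate with the others held fixed. Inverting
each of these operations in succession, we get formally

$$\varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left\{ \cdots \frac{1}{2\pi} \int_{-\infty}^{\infty} \left\{ \frac{1}{2\pi} \int_{-\infty}^{\infty} \psi(\sigma_1, \dots, \sigma_n)\, e^{i\sigma_n x_n}\,d\sigma_n \right\} e^{i\sigma_{n-1} x_{n-1}}\,d\sigma_{n-1} \cdots \right\} e^{i\sigma_1 x_1}\,d\sigma_1$$

or in the form of an $n$-tuple integral

$$\varphi(x) = \varphi(x_1, \dots, x_n) = \frac{1}{(2\pi)^n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \psi(\sigma_1, \dots, \sigma_n)\, e^{i(\sigma, x)}\,d\sigma_1 \cdots d\sigma_n. \qquad (2)$$

Since in general the function $\psi(\sigma)$ is not absolutely integrable over
$R_n$, formula (2) can only be meaningful if we specify a procedure
for evaluating the integral on the right-hand side. We shall give
several possible interpretations of the integral (2) below.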

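THEOREM. Let us suppose that the function $\varphi(x) = \varphi(x_1, \dots, x_n)$
satisfies the conditions

$$|\varphi(x_1 + t_1, x_2, \dots, x_n) - \varphi(x_1, x_2, \dots, x_n)| \le C\,|t_1|^\alpha, \qquad (3_1)$$

$$|\varphi(x_1, x_2 + t_2, x_3, \dots, x_n) - \varphi(x_1, x_2, \dots, x_n)| \le C(x_1)\,|t_2|^\alpha, \qquad (3_2)$$

$$\cdots$$

$$|\varphi(x_1, x_2, \dots, x_n + t_n) - \varphi(x_1, x_2, \dots, x_n)| \le C(x_1, \dots, x_{n-1})\,|t_n|^\alpha, \qquad (3_n)$$

$$\int_{-\infty}^{\infty} C(x_1)\,dx_1 < \infty, \quad \dots, \quad \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} C(x_1, \dots, x_{n-1})\,dx_1 \cdots dx_{n-1} < \infty \qquad (0 < \alpha < 1).$$

Then formula (2) is valid if it is understood as the result of successive
passages to the limit as $N_n \to \infty, \dots, N_1 \to \infty$:

$$\varphi(x) = \frac{1}{(2\pi)^n} \lim_{N_1 \to \infty} \int_{-N_1}^{N_1} \left\{ \cdots \lim_{N_{n-1} \to \infty} \int_{-N_{n-1}}^{N_{n-1}} \left\{ \lim_{N_n \to \infty} \int_{-N_n}^{N_n} \psi(\sigma_1, \dots, \sigma_n)\, e^{i\sigma_n x_n}\,d\sigma_n \right\} e^{i\sigma_{n-1} x_{n-1}}\,d\sigma_{n-1} \cdots \right\} e^{i\sigma_1 x_1}\,d\sigma_1. \qquad (4)$$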


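Proof. We put

$$\varphi_1(\sigma_1, x_2, \dots, x_n) = \int_{-\infty}^{\infty} \varphi(x_1, x_2, \dots, x_n)\, e^{-i\sigma_1 x_1}\,dx_1.$$

By Fubini's theorem the function $\varphi(x_1, \dots, x_n)$ is summable with
respect to $x_1$ for almost all $x_2, \dots, x_n$.

It follows from condition $(3_1)$ and the theorem of Section 2,
art. 2 that the inversion formula

$$\varphi(x_1, \dots, x_n) = \lim_{N_1 \to \infty} \frac{1}{2\pi} \int_{-N_1}^{N_1} \varphi_1(\sigma_1, x_2, \dots, x_n)\, e^{i\sigma_1 x_1}\,d\sigma_1$$

holds. The function $\varphi_1(\sigma_1, x_2, \dots, x_n)$ is summable with respect
to $x_2$ for almost all $x_3, \dots, x_n$ and satisfies the condition

$$|\varphi_1(\sigma_1, x_2 + t_2, \dots, x_n) - \varphi_1(\sigma_1, x_2, \dots, x_n)| \le \int_{-\infty}^{\infty} |\varphi(x_1, x_2 + t_2, \dots, x_n) - \varphi(x_1, x_2, \dots, x_n)|\,dx_1 \le |t_2|^\alpha \int_{-\infty}^{\infty} C(x_1)\,dx_1 = C_1 |t_2|^\alpha$$

in virtue of $(3_2)$. Hence for the function

$$\varphi_2(\sigma_1, \sigma_2, x_3, \dots, x_n) = \int_{-\infty}^{\infty} \varphi_1(\sigma_1, x_2, \dots, x_n)\, e^{-i\sigma_2 x_2}\,dx_2$$

the inversion formula

$$\varphi_1(\sigma_1, x_2, \dots, x_n) = \lim_{N_2 \to \infty} \frac{1}{2\pi} \int_{-N_2}^{N_2} \varphi_2(\sigma_1, \sigma_2, x_3, \dots, x_n)\, e^{i\sigma_2 x_2}\,d\sigma_2$$

holds and therefore

$$\varphi(x_1, \dots, x_n) = \lim_{N_1 \to \infty} \frac{1}{2\pi} \int_{-N_1}^{N_1} \left\{ \lim_{N_2 \to \infty} \frac{1}{2\pi} \int_{-N_2}^{N_2} \varphi_2(\sigma_1, \sigma_2, x_3, \dots, x_n)\, e^{i\sigma_2 x_2}\,d\sigma_2 \right\} e^{i\sigma_1 x_1}\,d\sigma_1.$$

Continuing the process, we arrive ultimately at the required
formula (4).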

The conditions of the theorem are satisfied, for example, if the
integrable function $\varphi(x_1, \dots, x_n)$ has partial derivatives $\partial\varphi/\partial x_1$,
$\partial\varphi/\partial x_2$, ..., $\partial\varphi/\partial x_n$, of which the first is bounded by a constant,
the second by an integrable function of $x_1$, the third by an integrable
function of $x_1$ and $x_2$, etc.

There exist alternative procedures for reducing the $n$-tuple
integral to a repeated one. Let us consider the expression of the
Fourier transform in spherical coordinates. In spherical coordinates
the integration is taken first for a fixed $r > 0$ over the sphere of
radius $r$, centre the origin, then with respect to $r$ from 0 to $\infty$. We
denote by $d\omega$ an element of the unit sphere $\Omega_1$; then an element
of the sphere $\Omega_r$ of radius $r$ will be represented by $r^{n-1}\,d\omega$. The
expression (1) is transformed into

$$\psi(\sigma) = \int_0^{\infty} \left\{ \int_{\Omega_1} e^{-i\rho r \cos\theta}\,\varphi(r, \omega)\,d\omega \right\} r^{n-1}\,dr.$$

Here $\omega$ denotes direction from the origin of coordinates, or, if
we like, a point of the unit sphere; $\varphi(r, \omega)$ is the value of the function
$\varphi(x)$ at the intersection of the directed line $\omega$ and the sphere
$\Omega_r$; $\rho$ is $|\sigma|$, and $\theta$ is the angle between the vectors $x$ and $\sigma$.

This formula acquires a particularly simple form when the function
$\varphi$ depends only on $r$, i.e. is spherically symmetric. In this case
$\varphi(r, \omega) = \varphi(r)$ and we have

$$\psi(\sigma) = \int_0^{\infty} \left\{ \int_{\Omega_1} e^{-i\rho r \cos\theta}\,d\omega \right\} \varphi(r)\, r^{n-1}\,dr. \qquad (5)$$

The enclosed integral can be completely determined. We consider
first the case $n = 3$; then, taking the direction of the vector $\sigma$
as polar axis, we have:

$$d\omega = \sin\theta\,d\theta\,d\alpha,$$



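where $\alpha$ is the polar angle in the plane orthogonal to the vector $\sigma$;
as a result we get

$$\int_{\Omega_1} e^{-i\rho r \cos\theta}\,d\omega = \int_0^{2\pi} \int_0^{\pi} e^{-i\rho r \cos\theta} \sin\theta\,d\theta\,d\alpha = 2\pi \int_0^{\pi} e^{-i\rho r \cos\theta} \sin\theta\,d\theta = \frac{2\pi}{i\rho r}\, e^{-i\rho r \cos\theta}\,\Big|_0^{\pi} = 4\pi\,\frac{\sin \rho r}{\rho r}$$

and hence

$$\psi(\rho) = \frac{4\pi}{\rho} \int_0^{\infty} \varphi(r)\, r \sin \rho r\,dr.$$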
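The three-dimensional radial formula $\psi(\rho) = \frac{4\pi}{\rho}\int_0^\infty \varphi(r)\,r\sin\rho r\,dr$ can be tested on a function whose transform is known in closed form. For $\varphi(r) = e^{-r^2/2}$ the transform in $R_3$ is $(2\pi)^{3/2} e^{-\rho^2/2}$. The sketch below evaluates the one-dimensional integral by the trapezoidal rule; the function name, the truncation radius and the test function are assumptions made for illustration.

```python
import math


def radial_transform_3d(phi, rho: float, r_max: float = 12.0, steps: int = 20000) -> float:
    """Evaluate (4*pi/rho) * integral over 0 <= r <= r_max of phi(r) * r * sin(rho*r).

    Uses the composite trapezoidal rule; r_max must be large enough for
    phi to be negligible beyond it. Requires rho > 0.
    """
    h = r_max / steps

    def integrand(r: float) -> float:
        return phi(r) * r * math.sin(rho * r)

    total = 0.5 * (integrand(0.0) + integrand(r_max))
    for k in range(1, steps):
        total += integrand(k * h)
    return 4.0 * math.pi / rho * total * h
```

A call such as `radial_transform_3d(lambda r: math.exp(-r * r / 2), 1.0)` can then be compared with `(2 * math.pi) ** 1.5 * math.exp(-0.5)`.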


In the general case, with $n$ arbitrary, the enclosed integral in (5)
can be expressed in terms of Bessel functions† in the form

$$\int_{\Omega_1} e^{-i\rho r \cos\theta}\,d\omega = (2\pi)^{n/2}\,(\rho r)^{1 - n/2}\,J_{n/2 - 1}(\rho r) \qquad (6)$$

and therefore a complete representation of the Fourier transform
of a spherically symmetric function is given by

$$\psi(\rho) = (2\pi)^{n/2}\,\rho^{1 - n/2} \int_0^{\infty} \varphi(r)\, r^{n/2}\,J_{n/2 - 1}(\rho r)\,dr. \qquad (7)$$


From formula (7) we can draw a somewhat startling conclusion:
for $n \ge 3$ the Fourier transform of a spherically symmetric function
in $n$-dimensional space is a differentiable function (for $\rho \ne 0$),
and furthermore the order of its differentiability increases
together with $n$. For a function $\varphi(r)$ is integrable over an $n$-dimensional
space if the integral

$$\int_0^{\infty} |\varphi(r)|\, r^{n-1}\,dr$$

exists. In the integral (7) each formal differentiation with respect
to $\rho$ increases the exponent of $r$ in the integrand by one; since,
for large $r$, $|J_{n/2-1}(\rho r)| \le C r^{-1/2}$, multiplication of the integrand
by $r$ is permissible so long as the total exponent of $r$ does not
exceed $n - 1/2$, i.e. at least $[n - 1/2 - n/2] = [(n - 1)/2]$ times.


† Cf. for example, R. Courant and D. Hilbert, Methods of Mathematical
Physics, Interscience, N.Y. (1953), Vol. 1, Chapter VII.




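Consequently the function $\psi(\rho)$ has derivatives (with respect to $\rho$)
at least up to order $[(n - 1)/2]$. For $n = 3$ the function $\psi(\rho)$ is
differentiable at least once.

Let us consider the Fourier integral inversion formula

$$\varphi(x) = \frac{1}{(2\pi)^n} \int_{R_n} \psi(\sigma)\, e^{i(\sigma, x)}\,d\sigma$$

and try to assign a meaning to it by first integrating over a sphere
$|\sigma| \le R$ and then letting $R$ tend to infinity. We put

$$\varphi_R(x) = \frac{1}{(2\pi)^n} \int_{|\sigma| \le R} \psi(\sigma)\, e^{i(\sigma, x)}\,d\sigma.$$

Substituting for $\psi(\sigma)$ from (1) and reversing the order of integration
with respect to $x$ and $\sigma$ (which we can do in virtue of the absolute
and uniform convergence with respect to $\sigma$ of the $n$-tuple integral
(1)), we get

$$\varphi_R(x) = \frac{1}{(2\pi)^n} \int_{R_n} \varphi(\xi) \left\{ \int_{|\sigma| \le R} e^{i(\sigma, x - \xi)}\,d\sigma \right\} d\xi = \frac{1}{(2\pi)^n} \int_{R_n} \varphi(\xi)\,H_R(x - \xi)\,d\xi.$$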


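We transform the enclosed integral to spherical coordinates in
accordance with formula (6):

$$H_R(x - \xi) = \int_{|\sigma| \le R} e^{i(\sigma, x - \xi)}\,d\sigma = \int_0^R \left\{ \int_{\Omega_1} e^{i\rho r \cos\theta}\,d\omega \right\} \rho^{n-1}\,d\rho$$

$$= (2\pi)^{n/2} \int_0^R (\rho r)^{1 - n/2}\,J_{n/2 - 1}(\rho r)\,\rho^{n-1}\,d\rho = (2\pi)^{n/2}\, r^{1 - n/2} \int_0^R \rho^{n/2}\,J_{n/2 - 1}(\rho r)\,d\rho \qquad (r = |x - \xi|).$$

Thus substituting $\tau$ for $\rho r$, we get

$$H_R(x - \xi) = (2\pi)^{n/2}\, r^{-n} \int_0^{Rr} \tau^{n/2}\,J_{n/2 - 1}(\tau)\,d\tau.$$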


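One of the results of the theory of Bessel functions is

$$\frac{d}{d\tau}\left[\tau^{n/2}\,J_{n/2}(\tau)\right] = \tau^{n/2}\,J_{n/2 - 1}(\tau);$$

hence

$$H_R(x - \xi) = (2\pi)^{n/2}\, r^{-n}\,(Rr)^{n/2}\,J_{n/2}(Rr) = (2\pi)^{n/2} \left(\frac{R}{r}\right)^{n/2} J_{n/2}(Rr).$$

For $\varphi_R(x)$ we get the expression

$$\varphi_R(x) = (2\pi)^{-n/2}\,R^{n/2} \int_{R_n} \varphi(\xi)\,\frac{J_{n/2}(R|\xi - x|)}{|\xi - x|^{n/2}}\,d\xi = (2\pi)^{-n/2}\,R^{n/2} \int_{R_n} \varphi(x + \xi)\,\frac{J_{n/2}(R|\xi|)}{|\xi|^{n/2}}\,d\xi. \qquad (8)$$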
2 


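In the last integral we first integrate with respect to $\xi$ over the
sphere $|\xi| = r$. We put

$$\Phi(r, x) = \frac{1}{|\Omega_r|} \int_{\Omega_r} \varphi(x + \xi)\,d\xi, \qquad (9)$$

where $|\Omega_r|$ is the area of the sphere $\Omega_r$.
This quantity is the mean value of the function $\varphi$ on a sphere of
radius $r$, centre $x$. The integral (8) now reduces to the form

$$\varphi_R(x) = c\,R^{n/2} \int_0^{\infty} \Phi(r, x)\, r^{n/2 - 1}\,J_{n/2}(Rr)\,dr.$$

Let us put

$$f(r) = \begin{cases} \Phi(r, x)\, r^{n/2 - 1} & \text{for } n \text{ even}, \\ \Phi(r, x)\, r^{n/2 - 1/2} & \text{for } n \text{ odd}; \end{cases}$$

then, denoting the integer $n/2 - 1$ or $n/2 - 1/2$ by $m$, we get

$$\varphi_R(x) = c\,R^m \int_0^{\infty} f\!\left(\frac{\tau}{R}\right) J^*(\tau)\,d\tau, \qquad (10)$$

where

$$J^*(\tau) = \begin{cases} J_{n/2}(\tau) & \text{for } n \text{ even}, \\ \tau^{-1/2}\,J_{n/2}(\tau) & \text{for } n \text{ odd}. \end{cases}$$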




It is known that the function $J_{n/2}(t)$ is bounded as $t \to 0$, and
at infinity has the form

$$J_{n/2}(t) = \sqrt{\frac{2}{\pi t}}\,\cos\!\left(t - \frac{(n + 1)\pi}{4}\right) + H(t),$$

where $H(t)$ is absolutely integrable. It follows that the integral of
$J_{n/2}(t)$ is defined and finite on $(0, \infty)$ (and converges conditionally
as $\tau \to \infty$). We put


1a) = aaa Ae), 


Fn) = [ Jn(a)ar. 


The asymptotic behaviour of the function $J^1_{n/2}(x)$ as $x \to \infty$ is
identical with that of $J_{n/2}(x)$. For integrating by parts, we find that
for any $r > 0$

$$\int_x^{\infty} \frac{e^{ia\tau}}{\tau^r}\,d\tau = \frac{e^{ia\tau}}{ia\,\tau^r}\,\Big|_x^{\infty} + \frac{r}{ia} \int_x^{\infty} \frac{e^{ia\tau}}{\tau^{r+1}}\,d\tau = -\frac{e^{iax}}{ia\,x^r} + O\!\left(\frac{1}{x^{r+1}}\right),$$

which ensures that the second term is absolutely integrable as
$x \to \infty$. As $x \to 0$, $J^1_{n/2}(x)$ is bounded in virtue of the convergence
of the integral $\int_0^{\infty} J_{n/2}(x)\,dx$. We can construct functions
6 


Ji (a) = [Ih (x) dr, Tn (e Jn) )t) dt, «5 
n (a) = f Jn (2) -f ) 


they all possess the same asymptotic properties as $J_{n/2}(x)$. The
same applies to the function $J^*(x)$.

After these preparatory remarks we formulate a theorem.
THEOREM (S. Bochner, 1932). If the function $f(r) = r^m\,\Phi(r, x)$
(cf. (9)) is bounded, continuous, and absolutely integrable (from
0 to $\infty$) and together with its derivatives up to order $m$ is of bounded
variation, then for all $x$

$$\varphi(x) = \lim_{R \to \infty} \varphi_R(x) = \frac{1}{(2\pi)^n} \lim_{R \to \infty} \int_{|\sigma| \le R} \psi(\sigma)\, e^{i(\sigma, x)}\,d\sigma. \qquad (11)$$
R-roo (22)" R00 


Proof. Integrating (10) by parts, we get 


Jilf)vaeoes -i() 78] - ae 


Since the function $f(\tau)$ has a zero of order not less than $m$ at
$\tau = 0$ and is bounded as $\tau \to \infty$, the first term on the right vanishes.
Repeated integration by parts gives


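$$(-1)^m R^m \int_0^{\infty} f\!\left(\frac{\tau}{R}\right) J^*(\tau)\,d\tau = \int_0^{\infty} f^{(m)}\!\left(\frac{\tau}{R}\right) J^{*m}(\tau)\,d\tau,$$

where $J^{*m}$ denotes the $m$-th iterated integral of $J^*$.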
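We split the resulting integral into two parts:

$$\int_0^{\infty} = \int_0^{N} + \int_N^{\infty}.$$

In virtue of general theorems on the limiting passage under the
integral sign, we have as $R \to \infty$

$$\int_0^{N} f^{(m)}\!\left(\frac{\tau}{R}\right) J^{*m}(\tau)\,d\tau \to f^{(m)}(0) \int_0^{N} J^{*m}(\tau)\,d\tau = C_N\, f^{(m)}(0).$$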


The second integral admits the bound 


Lec zoe" ndr| < 0 frm Ge a een 
+ [1 () 0996) 


where the function $H(\tau)$ is absolutely integrable, and $p = 1/2$ or 1.
Since $f^{(m)}(x)$ is of bounded variation, we can write

$$f^{(m)}(x) = A(x) - B(x),$$

where $A(x)$, $B(x)$ are non-negative bounded non-decreasing functions.
Then
{ os) foe) 
Tt eat Tt ebat 
<= —— a ees 
c| Falalea|+|foGye« 
N N 


tN La p(k 1 6 
< — | —_——_—_—_. — | ———— +> 
= (se) -wrEwe Gabueer 


as N -> oo independently of the value of R. 


oO 


fi(3) 




It follows that 


Re if 1{F) Fle) dr > 7 0) Cn. 


Thus

$$\lim_{R \to \infty} \varphi_R(x) = f^{(m)}(0)\,C_n = \varphi(x)\,C_n' \qquad (12)$$

for some constant $C_n'$. But since the relation (12) holds, for example,
for the function $e^{-|x|^2}$ with $C_n' = 1$, the constant $C_n'$ is equal
to unity for all $\varphi(x)$, and the theorem is proved.

Note. The conditions of the theorem are satisfied, for example,
if we suppose that the function $\varphi(x)$ itself has derivatives up to
order $m = [n/2 - 1]$, which are absolutely integrable over the
whole space. This follows immediately from formula (9) if we consider
that the integrals

$$\int_0^\infty \left| \int_\Omega \frac{\partial^k \varphi(x + \omega r)}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}\, d\omega \right| r^{n-1}\, dr$$

exist.


Let us consider now the question of the summability of the 
n-tuple Fourier integral by the method of arithmetic means. We 
proceed from the expression 


Qe e@ 
1 _ 
Po(¥y» +2 Bn) = ay [foe vey On) ChB do 
= =e 


_ -if- fob. « an &) Eom sin eer &i) dé 


to the mean over the interval $0 \le \varrho \le R$:


1 
Orly, Lp) = spr | pele. sss) %p) do 
0 


1 1 
a + #)| | ——;——d&, ..., d&. 
Lb foes of| tay 




The Fejer kernel 


_ RR 
n sae tj 
@(R, t) _ I] t? 
jal I 


is the product of n one-dimensional Fejér kernels of the form
$\sin^2(Rt/2)/t^2$. It possesses the following properties:

(a) $\Phi(R, t) \ge 0$;

(b) $\int \cdots \int \Phi(R, t)\, dt = 1$;

(c) $\int \cdots \int_{|t| \ge \varepsilon} \Phi(R, t)\, dt \to 0$ as $R \to \infty$.


The last is equivalent to the limiting relation 


: t; 
sin? —_2 


l O(R, tat =TT f ay, 
a] 


ig; fa1 14 1<2, 
JH1, 2,22 Nn 


which holds since each of the one-dimensional kernels has the 
given property. 
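
Properties (a)-(c) can be checked numerically in the one-dimensional case. The following is only a sketch: it assumes the normalisation $\frac{2}{\pi R} \sin^2(Rt/2)/t^2$ (chosen here so that the kernel has unit integral over the line), and the names `fejer_1d`, `mass_inside` and `tail_mass` are illustrative, not taken from the text.

```python
import math

def fejer_1d(R, t):
    """One-dimensional Fejer kernel, normalised (assumption) to unit integral."""
    if t == 0.0:
        return R / (2.0 * math.pi)  # limiting value as t -> 0
    return 2.0 * math.sin(R * t / 2.0) ** 2 / (math.pi * R * t * t)

def mass_inside(R, eps, steps=20000):
    """Midpoint-rule estimate of the integral of fejer_1d over [-eps, eps]."""
    h = 2.0 * eps / steps
    return h * sum(fejer_1d(R, -eps + (k + 0.5) * h) for k in range(steps))

def tail_mass(R, eps):
    """Mass outside [-eps, eps], using property (b): the total integral is 1."""
    return 1.0 - mass_inside(R, eps)

# Property (c): for fixed eps the tail mass shrinks as R grows.
for R in (10.0, 100.0):
    print(R, tail_mass(R, 0.5))
```

Since the n-dimensional kernel is a product of such factors, the mass of the cube $|t_j| \le \varepsilon$ is the product of the one-dimensional masses, which is how the limiting relation above reduces to the one-dimensional property.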
Just as in the single variable case, it follows from properties
(a)-(c) that

$$\varphi_R(x) \to \varphi(x), \quad (13)$$

if the function $\varphi(x)$ is continuous.

More generally: if $\varphi(x) = \varphi(x_1, \ldots, x_n)$ belongs to some normed
space $E \subset L_1(R_n)$ and is continuous relative to displacement in
that space, i.e.

$$\lim_{t \to 0} \|\varphi(x_1 + t_1, \ldots, x_n + t_n) - \varphi(x_1, \ldots, x_n)\| = 0,$$

then relation (13) holds in the norm of the space:

$$\|\varphi(x) - \varphi_R(x)\| \to 0.$$

As in the case of one variable, this last theorem reduces to a
uniqueness theorem for the Fourier transform when applied to the
space of all integrable functions: if the Fourier transforms $\psi_1(\sigma)$,
$\psi_2(\sigma)$ of two integrable functions $\varphi_1(x)$, $\varphi_2(x)$ coincide identically,
then $\varphi_1(x)$, $\varphi_2(x)$ coincide almost everywhere.




A class S (Section 3) can also be defined in the case of n independent
variables; it is composed of the infinitely differentiable
functions $\varphi(x_1, \ldots, x_n)$ for which, for any $k_1, \ldots, k_n$, $q_1, \ldots, q_n$,
the quantities

$$\left| x_1^{k_1} \cdots x_n^{k_n}\, \frac{\partial^{q_1 + \cdots + q_n} \varphi}{\partial x_1^{q_1} \cdots \partial x_n^{q_n}} \right|$$

are bounded over the whole space. Again as in the case of one
variable this class carries into itself under Fourier transformation.

Finally, the whole $L_2$-theory of Section 6 extends to the case
of n variables: if the function $\varphi(x)$ is square-summable over $R_n$,
then its Fourier transform, defined by the formula

$$\psi(\sigma) = \lim_{N\to\infty} \int_{-N}^{N} \cdots \int_{-N}^{N} \varphi(x)\, e^{i(\sigma, x)}\, dx \quad (17)$$

(the limit being taken in the metric of the space of square-summable
functions of $\sigma$), exists and belongs to $L_2$, and

$$\int \cdots \int |\psi(\sigma)|^2\, d\sigma = (2\pi)^n \int \cdots \int |\varphi(x)|^2\, dx. \quad (18)$$

There is also an analogue of the Wiener-Paley theorem, and of
the Bochner-Khinchin theorem on the representation of positive-definite
functions (Section 7).


Problems. 1. Prove that the Fourier transform of the function

$$\exp\Bigl(-\sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk}\, x_j x_k\Bigr), \quad (19)$$


where the exponent is the negation of a positive-definite quadratic form, is 
equal to 


= Dio) 
fg, (20) 
yD 
where 
}0 o, Op 
D = det |a;,|, D(o) = |0 1 (21) 


Hint. Reduce the quadratic form to canonical form by means of an ortho- 
gonal transformation. 




2. Find the Fourier transform of the function e~¢" (: = \z 2) . 


Answer: 


Concluding Remark 


Fourier series and integrals have their origin as a means of sol- 
ving problems in mathematical physics in the works of J. B. Fourier 
(French mathematician 1768-1830) and were systematised in his 
book The Analytic Theory of Heat (1822). The Laplace transform 
was applied by Euler (1737) to the solution of differential equations; 
P.S. Laplace (French mathematician, astronomer, and physicist 
1749-1827) developed it and made extensive use of it in his book 
The Analytic Theory of Probability (1812). The Fourier and La- 
place transforms have become one of the primary tools of mathe- 
matical physics of the nineteenth and twentieth centuries. The 
problem of representing arbitrary functions in Fourier series, 
which captured the attention of the foremost mathematicians of 
these centuries, did much to promote the emergence of a theory 
of functions of a real variable. The Lebesgue integral and the 
related equation $F[L_2(-\infty, \infty)] = L_2(-\infty, \infty)$ rendered the
Fourier transform indispensable in constructing the basic concepts 
of theoretical physics (quantum mechanics). The thirties saw im- 
portant new advances in the theory of the Fourier integral—in 
particular, the proofs of the Wiener—Paley (Section 6, art. 3) and 
Bochner—Khinchin (Section 7) theorems. In contemporary in- 
vestigations the technique of Fourier transformation is of increas- 
ing importance, and in particular, has enabled an approach to be 
made to the solution of fundamental problems in the general 
theory of linear partial differential equations with constant coefficients.
Recommended literature: A. Zygmund, Trigonometric
Series, Stechert, N.Y. (1935); E. C. Titchmarsh, Introduction to
the Theory of Fourier Integrals, Oxford (1937); B. van der Pol and
H. Bremmer, Operational Calculus, Cambridge (1950); D. I. Blochintsev,
Foundations of Quantum Mechanics, State Tech. Pub. Dept. (1949);
I. M. Gelfand and G. E. Shilov, Generalised Functions, Nos. 1-3,
Fizmatgiz (1958) (2nd ed., No. 1, 1960); I. M. Gelfand and
N. Ya. Vilenkin, Generalised Functions, No. 4, Fizmatgiz
(1961) (being translated, Academic Press, New York).


SUPPLEMENT 


1. FURTHER REMARKS ON SETS


The ideas and propositions that have made up Chapter I of our
book are sometimes referred to as "naive set theory". The fact
is that even the early stages of the development of set theory 
revealed paradoxes and contradictions indicating that no advance 
is possible without clear axioms and rules of proof. In particular, 
many doubts have been expressed about the so-called “axiom of 
choice’’, or its equivalent, Zorn’s lemma (see below). This subject 
became pressing, inasmuch as the axiom of choice proved neces- 
sary for the development of the more advanced branches of analysis 
itself and of other branches of mathematics. Since the twenties,
several systems of set theory have been proposed, with axioms
and rules of proof that avoid the classical paradoxes; unfortunately,
self-consistency has not been proved for any of them.
K. Gödel (1938) showed that association of the axiom of choice
with one of them, the "Neumann-Bernays system", cannot lead
to inconsistencies provided the system itself (without the axiom
of choice) is consistent. On the other hand, Specker (1953) showed
that the axiom of choice does not hold in another system, the 
“Quine system”, which admits of greater freedom in the treatment 
of sets. Our future constructions will relate to sets in the Neu- 
mann—Bernays sense; we shall not adduce here the relevant axioms 
and the reader must take our word that the constructions are 
carried out within the limits of the system. 

Definition. A set A is said to be partially ordered if a comparison
relation $\le$ (less than or equal to) is defined for certain pairs of
its elements (these pairs being described as comparable) with
fulfilment of the following conditions:

(1) $x \le x$ for any $x \in A$;

(2) if $x \le y$, $y \le x$, then $x = y$;

(3) if $x \le y$, $y \le z$, then $x \le z$.


We can regard any set A as partially ordered if the comparison
relation $x \le y$ is assumed to be defined only for y = x. This case
is trivial, and it seems natural to describe such a set as non-ordered.
The opposite will be the case if any two elements x, y of
A are comparable, so that one of the relations $x \le y$ or $y \le x$ always
holds. In this case A is simply said to be ordered, or linearly ordered.


Examples. 1. The set of integers is linearly ordered with the
usual definition of the sign $\le$. The same is true for the set of real
numbers.


2. The set of points (x, y) of the plane is partially ordered if we
assume that $(x_1, y_1) \le (x_2, y_2)$ provided $y_1 = y_2$, $x_1 \le x_2$.


3. The system A, B, ... of subsets of a given set E is partially
ordered if $A \le B$ is taken to mean the inclusion $A \subset B$.
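
For a finite set, conditions (1)-(3) can be verified by brute force. The sketch below checks them for finite samples of examples 1-3; the helper names `is_partial_order` and `is_linear_order` and the particular finite samples are assumptions made for illustration only.

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check conditions (1)-(3) for a relation leq on a finite set."""
    reflexive = all(leq(x, x) for x in elements)
    antisymmetric = all(x == y or not (leq(x, y) and leq(y, x))
                        for x, y in product(elements, repeat=2))
    transitive = all(leq(x, z) or not (leq(x, y) and leq(y, z))
                     for x, y, z in product(elements, repeat=3))
    return reflexive and antisymmetric and transitive

def is_linear_order(elements, leq):
    """A partial order in which any two elements are comparable."""
    return is_partial_order(elements, leq) and all(
        leq(x, y) or leq(y, x) for x, y in product(elements, repeat=2))

# Example 2 on a finite grid: points are comparable only on a common horizontal line.
grid = [(x, y) for x in range(3) for y in range(3)]

def plane_leq(p, q):
    return p[1] == q[1] and p[0] <= q[0]

# Example 3: all subsets of {0, 1, 2}, ordered by inclusion.
subsets = [frozenset(s) for s in
           ([], [0], [1], [2], [0, 1], [0, 2], [1, 2], [0, 1, 2])]

def subset_leq(a, b):
    return a <= b  # frozenset inclusion

print(is_partial_order(grid, plane_leq), is_linear_order(grid, plane_leq))
print(is_partial_order(subsets, subset_leq))
```

Restricting `grid` to a single horizontal line gives a linearly ordered subset, in line with the remark that follows.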

Every subset of a partially ordered set is also partially ordered,
with the same comparison relation. Generally speaking, linearly
ordered subsets can be distinguished in a partially ordered set:
for instance, the subset of points (x, y) on a horizontal line in
example 2. A linearly ordered subset B of a partially ordered
set A is said to be maximal if no wider linearly ordered subset
$B_1 \supset B$ exists. For example, in the partially ordered set of all
points of a plane (example 2) the set of all points of any horizontal
line is a maximal linearly ordered subset.

The following axiom plays an essential part in what follows. 

Axiom (Zorn, 1935). Every partially ordered set contains a maximal 
linearly ordered subset. 

Let us introduce two important new concepts. 

Let B be a subset of a partially ordered set A. The set B is said
to be bounded in A if an element $b_0 \in A$ exists such that $b \le b_0$
for all $b \in B$; every element $b_0 \in A$ satisfying this condition is
described as an upper bound of the set B. An element x of a partially
ordered set A is said to be maximal if there exists no element
$y \ne x$ greater than x.

In the trivial partially ordered set mentioned above, every
element x is maximal. At the same time, a partially ordered set
may have no maximal element at all (like the set of integers).

The following theorem gives a sufficient condition for the 
existence of maximal elements in a partially ordered set. 

THEOREM 1 (Zorn). If every linearly ordered subset B of a partially 
ordered set A is bounded, A contains a maximal element. 




Proof. Let $B_0$ be a maximal linearly ordered subset of A; by
Zorn's axiom, such a subset exists. Let $b_0$ be an upper bound of
the set $B_0$. We claim that $b_0$ is a maximal element. For, if there
were to exist an element $a_0 \in A$ greater than $b_0$, the set $B_0 + a_0$
would again be linearly ordered and larger than $B_0$, which contradicts
the fact that $B_0$ is maximal.
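
In a finite partially ordered set the argument of the proof can be carried out explicitly, with no appeal to Zorn's axiom. The sketch below builds a maximal linearly ordered subset greedily, takes its upper bound, and checks that this bound is a maximal element. The example (divisibility on the integers 1 to 12) and all function names are illustrative assumptions.

```python
def comparable(leq, x, y):
    return leq(x, y) or leq(y, x)

def maximal_chain(elements, leq):
    """Greedy maximal linearly ordered subset of a finite partially ordered set.

    An element rejected at some step is incomparable with a member already
    in the chain, so it can never be added later: the result is maximal.
    """
    chain = []
    for x in elements:
        if all(comparable(leq, x, c) for c in chain):
            chain.append(x)
    return chain

def upper_bound_in_chain(chain, leq):
    """Greatest element of a finite chain; it is an upper bound of the chain."""
    for b in chain:
        if all(leq(c, b) for c in chain):
            return b
    return None

def is_maximal(elements, leq, x):
    """True if no element different from x is greater than x."""
    return not any(y != x and leq(x, y) for y in elements)

A = list(range(1, 13))

def divides(a, b):
    return b % a == 0

B0 = maximal_chain(A, divides)        # plays the role of the maximal chain B_0
b0 = upper_bound_in_chain(B0, divides)  # plays the role of the upper bound b_0
print(B0, b0, is_maximal(A, divides, b0))
```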

The proof of a theorem of great importance in set theory, called
the theorem of choice, is based on theorem 1. Briefly, this theorem
asserts that, given a system of sets $\{A_\alpha\}$, a set Z exists which contains
precisely one point of each of the sets $A_\alpha$.

THEOREM 2 (theorem of choice). Let there be a set $A_\alpha$ for every
index $\alpha$ of some set A. Then there exists a set of elements $a_\alpha$ such that,
whatever the fixed $\alpha$ of A, the element $a_\alpha$ belongs to the set $A_\alpha$.

Proof. We consider the set $\mathfrak{U}$ of all functions $x_\alpha$ that are defined
on subsets of the set A and take values in the corresponding sets $A_\alpha$.
Every fixed element $y_\alpha$ of $A_\alpha$ determines one such function, which is defined
only for one value of $\alpha$. Thus $\mathfrak{U}$ is non-empty. A comparison
relation can be set up between the functions $x_\alpha$ by taking $x_\alpha \le y_\alpha$
if $y_\alpha$ is defined in every case for the same values of $\alpha$ as $x_\alpha$ (and
possibly for other values of $\alpha$), whilst $y_\alpha = x_\alpha$ on the domain
of definition of the function $x_\alpha$. We take any linearly ordered
subset $\mathfrak{U}_0 \subset \mathfrak{U}$. Let $A_0$ be the union of all subsets of the set A
on which the functions $x_\alpha \in \mathfrak{U}_0$ are given. Certain of the functions
$x_\alpha \in \mathfrak{U}_0$ are defined at every point $\alpha_0 \in A_0$, whilst the fact that
they can be compared implies that they take the same value $a_{\alpha_0}$
for $\alpha = \alpha_0$. This uniquely defined value defines a single-valued
function $y_\alpha$ on $A_0$. This latter can clearly be compared with all the
functions $x_\alpha \in \mathfrak{U}_0$, and in fact $x_\alpha \le y_\alpha$. We see that the linearly
ordered subset $\mathfrak{U}_0$ is bounded in $\mathfrak{U}$. By theorem 1, $\mathfrak{U}$ contains a
maximal element $a_\alpha$. We claim that the function $a_\alpha$ is defined on
the whole set A. For, if an element $\alpha_0 \in A$ were not contained in
its domain of definition, we could use any element $y_0 \in A_{\alpha_0}$ to
widen its domain of definition by adding the value $y_0$ at the point
$\alpha_0$, and $a_\alpha$ would not be a maximal element of $\mathfrak{U}$. Hence $a_\alpha$ is the
required function.
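
For a finite family of sets no axiom is needed, but the ordering "by extension" used in the proof can still be illustrated. In the sketch below a partial choice function is a dictionary from indices to chosen elements; `extends` implements the comparison relation and `extend_to_choice_function` performs the widening step until the function is defined everywhere. All names and the sample family are assumptions for illustration.

```python
def is_partial_choice(family, chosen):
    """chosen maps some indices to an element of the corresponding set."""
    return all(index in family and value in family[index]
               for index, value in chosen.items())

def extends(bigger, smaller):
    """Comparison relation of the proof: smaller <= bigger."""
    return all(index in bigger and bigger[index] == value
               for index, value in smaller.items())

def extend_to_choice_function(family, partial=None):
    """Widen a partial choice function until it is defined on every index."""
    chosen = dict(partial or {})
    for index, members in family.items():
        if index not in chosen:
            chosen[index] = next(iter(members))  # any element of the set will do
    return chosen

family = {"a": [1, 2], "b": [3], "c": [4, 5, 6]}
start = {"c": 5}
total = extend_to_choice_function(family, start)
print(total, extends(total, start))
```

A function that is not defined on some index can always be widened in this way, which is the reason a maximal element must be defined on the whole index set.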

Many constructions of analysis are based, explicitly or implicitly, 
on the theorem of choice. For example, the proof of the existence 
of a non-measurable set (p. 167) in fact requires the employment 
of this theorem. Applications of it can be traced in more elementary 
discussions. Take, for instance, the proof of the equivalence of
two familiar definitions of the continuity of a function f(x) at a
point $x_0$ of a metric space (the "$\varepsilon$-$\delta$ definition" and the "lim-definition").
The second definition is usually proved equivalent
to the first by reductio ad absurdum: it is assumed that the first
definition is not satisfied; given $\varepsilon > 0$, we assign a sequence of
values $\delta_n \to 0$ to $\delta$ and choose points $x_n$ in the corresponding
$\delta_n$-neighbourhoods of the point $x_0$ such that $|f(x_0) - f(x_n)| > \varepsilon$,
and thus obtain a contradiction with the second definition. It will
be seen that use has been made of the possibility of an arbitrary choice
during the argument.

Long before Zorn, the theorem of choice was stated as an axiom
by Zermelo (1904). Zermelo deduced from the "axiom of choice"
a theorem, at first sight somewhat paradoxical, to the effect that
every set can be "well ordered". To see the meaning of this
result we compare the order properties of three linearly ordered
sets: the set $A_1$ of all real numbers, the set $A_2$ of all real non-negative
numbers and the set $A_3$ of all natural numbers. The set
$A_1$ does not possess a first (least) element. The set $A_2$ has a first
element 0 but no immediately consecutive element. The set $A_3$
has a first element 1 and an immediately consecutive element 2,
etc.

The term "immediately consecutive" can be replaced by the
more precise "least consecutive". We say that a linearly ordered
set A is well ordered if every non-empty subset $C \subset A$ has a least
element. We denote the least element of a well ordered set A by 1,
the immediately consecutive element by 2, the next by 3, 4 etc.
If the set A is not finite this process leads to the construction of
a countable subset B = (1, 2, ..., n, ...). B may not coincide with
the whole of A, in which case there is an element immediately
following B, which we denote by $\omega$. We denote the element
immediately following $B + \omega$ by $\omega + 1$; the next elements are
naturally denoted by $\omega + 2$, $\omega + 3$, ... The elements $\omega + \omega = 2\omega$,
$3\omega$, ..., $\omega^2 = \omega \cdot \omega$ then make their appearance and so on. A
canonical system of numbering the elements of a well ordered set
is thus defined. But this numbering limits us to obtaining each
time no more than a finite or countable subset, and it is difficult to
imagine a non-countable well ordered set. We can now proceed
to an effective statement of Zermelo's theorem:

A comparison relation < can be introduced into any set A in such 
a way that it is well ordered. 




Zermelo's theorem may be deduced from Zorn's as follows.

Given any set A, let $\mathfrak{U}$ be the system of subsets of A that can
be well ordered. If a subset $B \subset A$ can be well ordered in several,
say p, different ways, it appears p times in the system $\mathfrak{U}$. This
system is not empty, since it contains, for example, all single-element
subsets of A. We introduce partial orderedness into the
system $\mathfrak{U}$ by assuming that $B_1 \le B_2$ if $B_1 \subset B_2$, the order relation
is the same on $B_1$ as on $B_2$, and all the elements of $B_2 - B_1$ are
greater than any element of $B_1$ (so that $B_1$ is so to speak the
origin of $B_2$). We show that the condition of Zorn's theorem is
satisfied. Let $\mathfrak{U}_0 \subset \mathfrak{U}$ be a linearly ordered system. By hypothesis,
there is a unique well orderedness for all elements of a set of the
system $\mathfrak{U}_0$. This will also hold on the union B of all subsets of
$\mathfrak{U}_0$. The union B, together with the comparison relation holding
on it, belongs to $\mathfrak{U}$, so that $\mathfrak{U}_0$ is bounded. By Zorn's theorem,
$\mathfrak{U}$ contains a maximal element $A_0$. The element $A_0$ is a subset
of A admitting of well orderedness. We claim that $A_0 = A$; for,
if there were an element $z \in A$ not belonging to $A_0$, we could
consider the subset $A_0 + z$ and regard z as following all the elements
of $A_0$; but in this case $A_0$ would not be a maximal well
ordered subset. Hence $A_0 = A$, and the theorem is proved.


2. THEOREMS ON LINEAR FUNCTIONALS 


The following theorem enables homeomorphic linear functionals 
to be constructed in a linear space. Its "complex" statement
originates from Hahn (1927) (although Hahn himself only considered
real spaces), and the "real" statement (which imposes
fewer conditions on the functional p(x)) from Banach (1929).

THEOREM (Hahn-Banach).

(a) Let p(x) be a functional given in a real linear space E and
satisfying the conditions

$$p(x + y) \le p(x) + p(y), \quad p(\lambda x) = \lambda p(x) \ \text{for } \lambda \ge 0. \quad (1)$$

Further, let f(x) be a given linear functional in a subspace $E_0 \subset E$,
satisfying the inequality

$$f(x) \le p(x). \quad (2)$$

We claim that there exists a linear functional $f^*(x)$ in space E that
coincides with f(x) on $E_0$ and satisfies everywhere on E the inequality

$$f^*(x) \le p(x). \quad (3)$$

In other words, we can continue f(x) from $E_0$ on to all of E whilst
retaining inequality (2).

(b) Let p(x) be a functional given in a complex linear space E and
satisfying the conditions

$$p(x + y) \le p(x) + p(y), \quad p(\lambda x) = |\lambda|\, p(x) \quad (1')$$

for any complex $\lambda$.
Further, let f(x) be a given linear functional on the subspace
$E_0 \subset E$, satisfying the inequality

$$|f(x)| \le p(x). \quad (2')$$

We claim that there exists a linear functional $f^*(x)$ in E, coinciding
with f(x) on $E_0$ and satisfying everywhere on E the inequality

$$|f^*(x)| \le p(x). \quad (3')$$

In other words, the functional f(x) can be continued from $E_0$ to
all of E whilst retaining inequality (2').

Proof. We denote elements of the subspace $E_0$ by y. We show
first that the functional f can be extended from $E_0$ on to a subspace
$E_1$ with one more dimension.

More precisely, what we do is this: we take any vector $x_0$ not
belonging to $E_0$ and show that it is possible to construct a linear
functional $f_1(x)$, defined on the subspace $E_1$ of all linear combinations
of vectors $y \in E_0$ and the vector $x_0$, in such a way that
$f_1(x)$ coincides with f(x) on $E_0$ and is bounded on $E_1$ by the same
functional p(x) (in absolute value in the complex case).

We first perform the construction for the real case. We define
$f_1$ in accordance with the natural formula

$$f_1(x) = f_1(y + \lambda x_0) = f(y) + \lambda f_1(x_0), \quad (4)$$

where the number $f_1(x_0)$ is defined in such a way that condition
(2) is satisfied. Condition (2) can now be written as

$$f_1(x) = f(y) + \lambda f_1(x_0) \le p(y + \lambda x_0). \quad (5)$$

When $\lambda > 0$ we can write this as

$$f(y/\lambda) + f_1(x_0) \le p(y/\lambda + x_0)$$

or

$$f(y_1) + f_1(x_0) \le p(y_1 + x_0), \quad y_1 = y/\lambda. \quad (6)$$




When $\lambda < 0$ we put $\mu = -\lambda$ and obtain

$$f(y) - \mu f_1(x_0) \le p(y - \mu x_0),$$

or, dividing by $\mu$ and denoting $y_2 = y/\mu$,

$$f(y_2) - f_1(x_0) \le p(y_2 - x_0). \quad (7)$$

Conversely, if the number $f_1(x_0)$ satisfies inequalities (6) and (7)
for any $y_1 \in E_0$, $y_2 \in E_0$, then (5) is also satisfied for any $y \in E_0$
and any $\lambda$. Inequalities (6), (7) can be written as

$$f(y_2) - p(y_2 - x_0) \le f_1(x_0) \le p(y_1 + x_0) - f(y_1).$$

We see that the solution of our problem depends on the relationship
between the numbers

$$\alpha = \sup_{y_2 \in E_0} \{f(y_2) - p(y_2 - x_0)\}$$

and

$$\beta = \inf_{y_1 \in E_0} \{p(y_1 + x_0) - f(y_1)\}.$$

If $\alpha \le \beta$, the problem is soluble; if $\alpha > \beta$, it would prove impossible
to find a number $f_1(x_0)$ satisfying condition (2). Let us show
that always, in fact, $\alpha \le \beta$. We need to show that

$$f(y_2) - p(y_2 - x_0) \le p(y_1 + x_0) - f(y_1),$$

for any $y_1$, $y_2$ of $E_0$, or what is the same thing,

$$f(y_1) + f(y_2) \le p(y_1 + x_0) + p(y_2 - x_0). \quad (8)$$

We show that (8) in fact holds. We have

$$f(y_1) + f(y_2) = f(y_1 + y_2) \le p(y_1 + y_2) = p(y_1 + x_0 + y_2 - x_0) \le p(y_1 + x_0) + p(y_2 - x_0),$$

as required. Thus the required number $f_1(x_0)$ exists, so that the
required continuation of the functional f is possible.
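
The one-dimensional extension step can be followed on a small concrete case. The sketch below is an illustration under stated assumptions, not part of the original argument: E is the plane, p is the sum of absolute values of the coordinates, $E_0$ is the horizontal axis with f((t, 0)) = t, and $x_0 = (0, 1)$. The supremum and infimum are taken over an integer sample of $E_0$, which in this particular example already gives the exact values.

```python
def p(x):
    """Sublinear functional on the plane (assumed here: |x1| + |x2|)."""
    return abs(x[0]) + abs(x[1])

def f(t):
    """Linear functional on E0 = {(t, 0)}: f((t, 0)) = t, so f <= p on E0."""
    return t

x0 = (0, 1)
sample = range(-20, 21)  # integer sample of E0, so all arithmetic is exact

# alpha = sup {f(y2) - p(y2 - x0)},  beta = inf {p(y1 + x0) - f(y1)}
alpha = max(f(t) - p((t - x0[0], -x0[1])) for t in sample)
beta = min(p((t + x0[0], x0[1])) - f(t) for t in sample)

def f1(x, c):
    """Extension f1(y + lam * x0) = f(y) + lam * c, where c = f1(x0)."""
    return f(x[0]) + x[1] * c

print(alpha, beta)  # any c with alpha <= c <= beta gives an admissible extension
```

In this example every value c of $f_1(x_0)$ between the two printed numbers keeps the inequality $f_1(x) \le p(x)$ on the whole plane, while a value outside that interval violates it at $x_0$ or at $-x_0$.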

In the complex case we resolve the functional f(y) into real and
imaginary components:

$$f(y) = g(y) + i h(y).$$

The functionals g(y) and h(y) are real and linear in $E_0$, regarded
as a real space, and are bounded along with the functional f by
the same functional p(x). We also have

$$f(iy) = i f(y) = g(iy) + i h(iy) = i g(y) - h(y),$$

and $h(y) = -g(iy)$, whence

$$f(y) = g(y) - i g(iy).$$

By what we have proved, the functional g(y) can be continued
whilst retaining inequality (2') on to the real subspace $E_{1/2}$ of
real linear combinations of vectors $y \in E_0$ and the vector $x_0$,
and then, whilst still retaining the inequality, on to the subspace
$E_1 \supset E_{1/2}$ of real linear combinations of vectors $y \in E_0$, $x_0$ and $i x_0$.
We put

$$f(x) = g(x) - i g(ix)$$

on $E_1$. We get an extension of the functional f from $E_0$ onto $E_1$.
Let us verify that it remains a linear functional in the complex
space $E_1$. It is sufficient to verify that $f(ix) = i f(x)$. For

$$f(ix) = g(ix) - i g(-x) = i[-i g(ix) - g(-x)] = i[g(x) - i g(ix)] = i f(x).$$

It remains to show that inequality (3') is satisfied for the functional
f in $E_1$. Given x, we choose a real number $\theta$ such that
$e^{i\theta} f(x)$ is a real non-negative number. Then $e^{i\theta} f(x) = f(e^{i\theta} x)$
$= g(e^{i\theta} x)$ and consequently,

$$|f(x)| = |e^{i\theta} f(x)| = g(e^{i\theta} x) \le p(e^{i\theta} x) = p(x),$$

as required.
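
The passage from the real part g back to the complex functional can be checked numerically. In the sketch below (an illustration only) g is taken to be the real part of a fixed complex-linear form on pairs of complex numbers; the coefficients and the names `g` and `f` are assumptions chosen for the example.

```python
COEFFS = (2 - 1j, 0.5 + 3j)  # illustrative coefficients of a complex-linear form

def g(z):
    """A real-linear functional: the real part of the complex-linear form."""
    return (COEFFS[0] * z[0] + COEFFS[1] * z[1]).real

def f(z):
    """Complex functional rebuilt from its real part: f(z) = g(z) - i g(iz)."""
    iz = (1j * z[0], 1j * z[1])
    return g(z) - 1j * g(iz)

z = (1 + 2j, -3 + 0.5j)
iz = (1j * z[0], 1j * z[1])
original = COEFFS[0] * z[0] + COEFFS[1] * z[1]

print(abs(f(z) - original))    # f recovers the complex-linear form
print(abs(f(iz) - 1j * f(z)))  # f(iz) = i f(z), the homogeneity checked in the text
```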

Thus a given functional can always be extended, in the real
or complex case, from a given subspace $E_0 \subset E$ to a larger space
$E_1$ whilst retaining inequality (3) or (3'). We show that an extension
of the functional f exists on to the whole space E whilst
retaining the relevant inequality. We apply Zorn's theorem of
Section 1, and consider the system $\mathfrak{U}$ of the subspaces of E on
which the required extension is possible. It will be assumed that
a given subspace $E_\alpha$ appears as many times in $\mathfrak{U}$ as there are
possible methods of extending the functional f onto $E_\alpha$. The
whole family $\mathfrak{U}$ can be partially ordered by taking $E_\alpha \le E_\beta$ if
$E_\alpha \subset E_\beta$ and the values of the functional f, extended on to $E_\alpha$
and $E_\beta$, coincide on $E_\alpha$. Let $\mathfrak{U}_0$ be a linearly ordered subsystem
of $\mathfrak{U}$. The functional f is uniquely defined and satisfies inequality
(3) or (3') on the union $E_\omega$ of all subspaces $E_\alpha$ belonging to $\mathfrak{U}_0$.
Consequently $E_\omega$ itself belongs to $\mathfrak{U}$, and obviously, is an upper
bound for all the $E_\alpha \in \mathfrak{U}_0$. Hence $\mathfrak{U}_0$ is bounded. By Zorn's
theorem, the system $\mathfrak{U}$ contains a maximal element $E^*$. The linear
functional f can be extended on to the subspace $E^*$ whilst retaining
inequality (3) or (3'). Suppose $E^* \ne E$; then f could be extended
on to a larger subspace, which contradicts the fact that $E^*$ is
maximal. Hence $E^* = E$, and the theorem is proved.

Some corollaries of the Hahn-Banach theorem may be mentioned.

1. The norm of the element x, or a multiple of it, can be taken
as the functional p(x) (in both the real and the complex case).
We now find, in particular, that a functional, satisfying the inequality

$$|f(x)| \le C \|x\|$$

on the subspace $E_0 \subset E$ (and hence having a norm not exceeding C
on $E_0$) can be extended to all of E whilst preserving this norm.
This is the most commonly encountered corollary of the Hahn-Banach
theorem.

2. There exists for every element $x_0 \ne 0$ of a normed space E a
linear functional f, defined on all E, of norm 1, and such that
$f(x_0) \ne 0$. For, putting $f(\lambda x_0) = \lambda \|x_0\|$, we get a linear functional
with norm 1, defined on the one-dimensional subspace generated
by the element $x_0$. We continue it onto all E whilst preserving
the norm and obtain the required functional.

3. There exists for any closed subspace $E_0 \ne E$ and element
$x_0 \notin E_0$ a linear functional f with norm 1, equal to 0 on $E_0$ and
such that $f(x_0) \ne 0$.

For we can define the functional f on the subspace $E_1$ composed
of all vectors of the form $x = y + \lambda x_0$ ($y \in E_0$, $\lambda$ is any number)
by means of the formula

$$f(x) = c\lambda,$$

where c is a positive constant. We have on $E_1$:

$$\|f\| = \sup_{x \in E_1} \frac{|c\lambda|}{\|y + \lambda x_0\|} = \sup_{y \in E_0} \frac{c}{\|y/\lambda + x_0\|} = \frac{c}{\inf_{y \in E_0} \|y/\lambda + x_0\|} = \frac{c}{d},$$

where $d = \inf_{y \in E_0} \|y/\lambda + x_0\|$ is the distance from the element $x_0$ to
the subspace $E_0$; d is positive, since $E_0$ is closed. It may be seen
that, if we put c = d, a functional with norm 1 is obtained on $E_1$.
On continuing it further onto all E whilst preserving the norm,
the required functional is obtained.

Note 1. An alternative statement of corollary 3 is: there exists
for any subspace $E_0 \subset E$ and element $x_0$, not belonging to the closure
of $E_0$, a linear functional with norm 1, equal to 0 on $E_0$ and such
that $f(x_0) \ne 0$.

Note 2. The functional p(x), figuring in the complex statement
of the Hahn-Banach theorem, is "almost" a norm; it differs
from the norm in that it can vanish, not only for x = 0, but
also on an entire manifold. We shall call such a functional a seminorm.
In the real case p(x) differs still further from the norm in
its properties; it can even take negative values.

Note 3. The Hahn-Banach theorem can be applied with great 
advantage in certain constructions of analysis. For example, when 
deducing the general form of a continuous linear functional in 
space C (a, b) (Chapter VI, Section 7, art. 1), we could have avoided 
the argument about continuing the functional onto a larger 
space (stage I) and applied the theorem directly instead. 

It is easily verified in this case, as in stage II, that the resulting 
function F(&) = f[y,,< (x)] is of bounded variation. But it will 
not in general be true that this function is continuous from the 
right, as is required for total additivity of the measure. Though 
this fact may not be important for a single variable, since the 
integrating function can be “improved” at discontinuities by 
making it continuous from the right (art. 2 Section 6), the possi- 
bility of such improvement is a much more complicated matter 
in the case of several variables. 

Let us mention some facts in connection with sequences of 
linear functionals. 

THEOREM (Banach and Steinhaus, 1927). If the values of linear
functionals $f_1(x), \ldots, f_n(x), \ldots$ are bounded on every fixed element
$x \in E$, they are uniformly bounded on the unit sphere of space E,
in other words, the norms of functionals $f_n(x)$ have a common bound.

Proof. If the sequence of linear functionals $f_n(x)$ is not bounded
in the unit sphere $\|x\| \le 1$, it is evidently not bounded in any sphere
$\|x\| \le r$; and further, it is not bounded in any sphere $U(x_0, r)$
$= \{x : \|x - x_0\| \le r\}$, since if the numbers $f_n(x)$, $f_n(x_0)$ were
bounded for $x \in U(x_0, r)$, the numbers $f_n(x - x_0) = f_n(x) - f_n(x_0)$
would also be bounded, which is impossible, since $x - x_0$ runs
over the sphere of radius r, centre the origin. We note this and
choose an element $x_1$ in the unit sphere, $\|x_1\| < 1$, on which one
of the functionals $f_n$, say $f_1$, exceeds 1 in absolute value:

$$|f_1(x_1)| > 1.$$

Since $f_1$ is a continuous functional, there exists a sphere $\|x - x_1\| \le r_1$,
lying wholly inside the initial sphere $\|x\| \le 1$, in which the
inequality

$$|f_1(x)| > 1$$

is satisfied. We find an element $x_2$ inside this sphere, and a functional
$f_2$, such that

$$|f_2(x_2)| > 2,$$

then choose a new sphere $\|x - x_2\| \le r_2$, lying in the previous
one, at every point of which

$$|f_2(x)| > 2.$$

On proceeding thus, we get a sequence of nested spheres of radii
$r_1, r_2, \ldots, r_n, \ldots$, tending to zero. The inequalities

$$|f_1(x_0)| > 1, \quad |f_2(x_0)| > 2, \quad \ldots, \quad |f_n(x_0)| > n, \quad \ldots$$

hold at the common point $x_0$ of all these spheres (which exists by
virtue of the completeness of space E and the lemma of Chapter II,
Section 5), i.e. the numbers $f_n(x_0)$ are unbounded, which contradicts
the hypothesis.

COROLLARY. If a sequence of continuous linear functionals
$f_1(x), \ldots, f_n(x), \ldots$ is convergent at every $x \in E$, the limiting functional
$f(x) = \lim f_n(x)$ is also linear and continuous.

For, the linear properties of f(x) are got by a passage to the
limit in the equation $f_n(\alpha x + \beta y) = \alpha f_n(x) + \beta f_n(y)$, whilst it
follows from Banach's theorem that the values of this functional
are bounded in the unit sphere.


Problems. 1. A linear functional f is given on a subspace $E_0$ of a linear
space E (without a norm). Show that it can be continued linearly onto all
of E.

Hint. Use the scheme of proof of the Hahn-Banach theorem (without
bothering about the value of $f(x_0)$).

2. Show that there exists a linear functional, defined everywhere in an
infinite-dimensional normed space, for which

$$\sup_{\|x\| \le 1} |f(x)| = \infty.$$

Hint. Modify the scheme of proof of the Hahn-Banach theorem in such
a way that the norm of the functional is increased by one for every extension
"by one dimension".

Start from an arbitrary functional, defined on a one-dimensional space;
after a countable number of extensions by one dimension an unbounded
functional is obtained. Use problem 1 for the further extension (to the
whole of E).


3. Prove that the following spaces of (complex) numerical sequences
(with the natural linear operations) are complete:

(a) the space $c_0$ of sequences tending to zero:

$$x = (\xi_1, \xi_2, \ldots, \xi_n, \ldots), \quad \lim \xi_n = 0,$$

with the norm $\|x\| = \sup_n |\xi_n|$;

(b) the space $l_1$ of sequences

$$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n, \ldots), \quad \sum_{n=1}^{\infty} |\alpha_n| < \infty,$$

with the norm $\|\alpha\| = \sum_{n=1}^{\infty} |\alpha_n|$;

(c) the space m of bounded numerical sequences

$$x = (\xi_1, \xi_2, \ldots, \xi_n, \ldots), \quad \sup_n |\xi_n| < \infty,$$

with the norm $\|x\| = \sup_n |\xi_n|$.

Note. The space $c_0$ is a closed subset of space m; $c_0$ is separable, whilst
m is not separable (see problem 14 of Chapter II, Section 3), since it possesses
a set of elements with the power of a continuum with distance 1 between
any two elements (see theorem 3, Chapter I, Section 5).


4. Find the general form (and the norm) of a continuous linear functional
in space $c_0$ and in space $l_1$ (problem 8 of Section 9).

Hint. Using the notation $e_n = (0, \ldots, 0, 1, 0, \ldots)$ (with 1 in the n-th place),
any element of either space can be developed into a series in $e_n$, convergent
in the norm of the space. Hence it is sufficient to find the numbers $f(e_n)$
and discuss their properties.

Answer. A continuous linear functional f(x) is defined in space $c_0$ by an
element $\alpha \in l_1$, so that

$$f(x) = \sum_{n=1}^{\infty} \alpha_n \xi_n, \quad \|f\| = \sum_{n=1}^{\infty} |\alpha_n|$$

for $x = (\xi_1, \ldots, \xi_n, \ldots) \in c_0$. A continuous linear functional $g(\alpha)$ is given in
space $l_1$ by an element $y = (\eta_1, \ldots, \eta_n, \ldots) \in m$, so that

$$g(\alpha) = \sum_{n=1}^{\infty} \eta_n \alpha_n, \quad \|g\| = \sup_n |\eta_n|$$

for $\alpha = (\alpha_1, \ldots, \alpha_n, \ldots) \in l_1$.
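
The norm formula for functionals on $c_0$ can be made plausible numerically. The sketch below works with a truncated real sequence $\alpha$ (an assumption made so that everything is finite) and evaluates the functional on the elements of $c_0$ that equal the sign of $\alpha_n$ in the first N places and zero afterwards; these have norm 1 and the values approach $\sum |\alpha_n|$. The names `functional`, `sup_norm` and `extremal` are illustrative.

```python
def functional(alpha, x):
    """f(x) = sum of alpha_n * xi_n over the (finite) available terms."""
    return sum(a * xi for a, xi in zip(alpha, x))

def sup_norm(x):
    return max(abs(xi) for xi in x)

def extremal(alpha, N):
    """Element of c_0: sign(alpha_n) in the first N places, zero afterwards."""
    return [(1.0 if a >= 0 else -1.0) if n < N else 0.0
            for n, a in enumerate(alpha)]

# Truncated element of l_1 with alternating signs (powers of 2 are exact in floats).
alpha = [(-1) ** n / 2 ** n for n in range(1, 21)]
l1_norm = sum(abs(a) for a in alpha)

values = [functional(alpha, extremal(alpha, N)) for N in range(1, 21)]
print(l1_norm, values[0], values[-1])
```

The values increase with N and never exceed the $l_1$ norm, which is the inequality $|f(x)| \le \|\alpha\| \|x\|$ together with its sharpness.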




Note. We might try to show that any functional in space m can be written
in the form

$$f(x) = \sum_{n=1}^{\infty} \beta_n \xi_n, \quad (9)$$

where $\beta_n$ is a fixed sequence of numbers and $x = (\xi_1, \ldots, \xi_n, \ldots)$. This is
not the case, however. We shall see in the next problem that there exists a
linear functional f(x) in space m which maps every convergent sequence
$x = (\xi_1, \ldots, \xi_n, \ldots)$ onto its limit $\lim \xi_n$. Such a functional cannot have
the form (9). For, application of (9) to the element $e_n$ would give $f(e_n) = \beta_n = 0$,
so that $f(x) = \sum_{n=1}^{\infty} \beta_n \xi_n = 0$; and this last is false, since e.g. f(e) = 1,
where e = (1, 1, ..., 1, ...).

5. Construct on the space m of all bounded complex sequences
$x = (\xi_1, \ldots, \xi_n, \ldots)$ with the norm $\|x\| = \sup_n |\xi_n|$ a linear functional with
norm 1 that maps every sequence x onto a number $\xi$, lying in the least
convex set that contains all the limit points of the sequence $\xi_n$ (and in
particular, maps every convergent sequence onto its limit).

Hint. Let $m_r \subset m$ denote the aggregate of real sequences. Define the
seminorm $p(x) = \limsup |\xi_n|$ on $m_r$. Take the real linear functional f, equal
to 1 on the element e = (1, 1, ..., 1, ...) and hence having the seminorm 1
on the subspace $\{\lambda e\}$, and continue it on to all $m_r$ whilst preserving the
seminorm. Use the relations $|f(x)| \le p(x)$, $f(x - \lambda e) = f(x) - \lambda$ to verify that
the value of the functional f(x) lies between $\liminf \xi_n$ and $\limsup \xi_n$. Put
$f(x + iy) = f(x) + i f(y)$ for the complex sequence x + iy (prove that the functional
obtained is linear). Its value (a complex number) does not lie to the right
of the vertical line through the right-hand limit point of the sequence $\xi_n$.
Obtain the required result by replacing x by $e^{i\theta} x$ and applying similar
arguments.


6. (continued). Having fixed an element $\alpha = (\alpha_0, \ldots, \alpha_n, \ldots)$ for which 
$\alpha_j \ge 0$, $\sum_{j=0}^{\infty} \alpha_j = 1$, we can use it to pass from a bounded sequence 
$x = (\xi_1, \ldots, \xi_n, \ldots)$ to a new bounded sequence $\alpha * x = (\eta_1, \ldots, \eta_n, \ldots)$, where 

$$\eta_n = \sum_{j=0}^{\infty} \alpha_j \xi_{n+j} \qquad (n = 1, 2, \ldots).$$

We shall refer to this operation as an $\alpha$-operation. 

Show that all the limit points of the sequence $\alpha * x$ lie in the convex 
envelope of the set of all limit points of the sequence $x$. 

Hint. The number $\eta_n$ is the “mean” of the numbers $\xi_n, \xi_{n+1}, \ldots$ 
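
The hint can be checked numerically on a small case. The following is a minimal sketch, not part of the original text: it assumes a weight vector with only finitely many nonzero terms (so each sum is exact), uses 0-based indices, and the helper name `alpha_operation` is hypothetical.

```python
from fractions import Fraction

def alpha_operation(alpha, x):
    """Return eta_n = sum_j alpha[j] * x[n + j] for every n where all terms exist.

    alpha: finitely many non-negative weights summing to 1 (an assumption here).
    x:     a finite stretch of a bounded sequence.
    """
    m = len(alpha)
    return [sum(a * x[n + j] for j, a in enumerate(alpha))
            for n in range(len(x) - m + 1)]

# x = 0, 1, 0, 1, ... has the two limit points 0 and 1.
x = [Fraction(n % 2) for n in range(20)]

eta_half = alpha_operation([Fraction(1, 2), Fraction(1, 2)], x)
eta_skew = alpha_operation([Fraction(1, 3), Fraction(2, 3)], x)

print(eta_half[:4])  # every term is the mean of a 0 and a 1
print(eta_skew[:4])  # terms alternate between two weighted means
```

Each term of the new sequence is a weighted mean of terms of `x`, so it stays in the interval spanned by the limit points 0 and 1, which is the convex envelope in this example.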


7. (continued). Let the condition $\alpha_j \ge 0$ of problem 6 be replaced by 
$\sum_{j=0}^{\infty} |\alpha_j| = A < \infty$ (the $\alpha_j$ can now be any complex numbers). Show that the 
limit points of the sequence $\alpha * x$ lie in a circle concentric with the circle 
$Q(x)$ containing all the limit points of the sequence $x$, and with a radius $A$ 
times greater than the radius of $Q(x)$. 

Hint. It is sufficient to take the case when the centre of $Q(x)$ is the coordinate 
origin. 




Note. We shall call the number $\xi$ the $\alpha$-limit of a sequence $x$ if $\xi$ is the 
ordinary limit of the sequence $\alpha * x$. 

The result of problems 6-7 shows that, if a sequence $x$ has the ordinary 
limit $\xi$, it also has an $\alpha$-limit, equal to $\xi$ for any choice of $\alpha$; further, if the 
$\alpha$-limit exists, it lies in the $A$-extension of the circle $Q(x)$. In particular, 
$p(\alpha * x) \le A\, p(x)$. 

8. (continued). Improve the structure of the functional $f$ of problem 5 in 
such a way that every sequence with an $\alpha$-limit (for a given fixed $\alpha$) is mapped 
by the functional $f$ onto the value of this limit. 

Hint. The subspace $m_\alpha$ of sequences with zero $\alpha$-limit does not contain 
the element $e$ in its closure with respect to the seminorm $p(x)$, since 
$A\, p(e - x) \ge p(\alpha * (e - x)) = p(\alpha * e - \alpha * x) = p(e) = 1$. Put $f(x) = 0$ on $m_\alpha$, $f(e) = 1$ 
and continue on to all $m$ whilst preserving the seminorm. 

9. (continued). Improve the structure of the functional $f$ in such a way 
that every sequence that has an $\alpha$-limit for at least one $\alpha$ is mapped by the 
functional $f$ onto this limit. 

Hint. If $x$ has an $\alpha$-limit, and $y$ a $\beta$-limit, $x + y$ has a $\gamma$-limit, where 

$$\gamma_n = \sum_{j=0}^{n} \alpha_j \beta_{n-j}.$$

The totality of all elements $x \in m$ with a zero $\alpha$-limit for some $\alpha$ is therefore 
a subspace. 

Note. There exist sequences $x \in m$ that have no $\alpha$-limit for any $\alpha$; an 
example of such a sequence is $(1, 0, \ldots, 0, 1, 0, \ldots, 0, 1, 0, \ldots)$, where the 
intervals filled continuously by zeros increase indefinitely. 

10. (continued). A sequence is said to be quasizero if, given any $\varepsilon > 0$, 
there exists an element $\alpha = (\alpha_0, \ldots, \alpha_n, \ldots)$, $\alpha_j \ge 0$, $\sum_{j=0}^{\infty} \alpha_j = 1$, such that 
$p(\alpha * x) < \varepsilon$. 

Show that the functional $f$ (problem 5) can be chosen so that $f(x) = 0$ 
for any quasizero sequence. 

Hint. The quasizero sequences form a subspace, and $e = (1, \ldots, 1, \ldots)$ 
does not belong to its closure with respect to the seminorm $p(x)$. 


11. (continued). Every sequence of the form $\mu * x$, where 

$$\mu = (\mu_0, \ldots, \mu_n, \ldots), \qquad \sum_{n=0}^{\infty} |\mu_n| < \infty, \qquad \sum_{n=0}^{\infty} \mu_n = 0,$$

is quasizero. 

Hint. Show that the proposition holds for $\mu_0 = (1, -1, 0, 0, \ldots)$ and 
displacements of it. Express an arbitrary quasizero sequence $\mu$ as a linear 
combination of $\mu_0$, displacements of $\mu_0$, and a sequence $z$ with $p(z) < \varepsilon$. 

12. (continued). Show that the functional $f$ formed in problem 10 has the 
property $f(\alpha * x) = f(x)$ for any $\alpha$. 

Hint. By the result of problem 11, the sequence $x - \alpha * x$ is quasizero. 

Note. In particular, when $\alpha = (0, 1, 0, \ldots)$, the sequence $\alpha * x$ is a displacement 
of the sequence $x$; hence the value of the functional $f(x)$ remains 
unchanged on displacement of the sequence $x$. 




13. If the numbers $\alpha_1, \ldots, \alpha_n, \ldots$ are such that the series $\sum_{k=1}^{\infty} \xi_k \alpha_k$ is 
convergent for any $x = (\xi_1, \ldots, \xi_n, \ldots) \in c_0$, then $\sum_{n=1}^{\infty} |\alpha_n| < \infty$. 

Hint. It follows from the hypothesis that the sequence of functionals 
$f_n(x) = \sum_{k=1}^{n} \xi_k \alpha_k$ on $c_0$ is convergent for any element $x \in c_0$. Then use the 
Hahn-Banach theorem and the general form of a linear functional on $c_0$ 
(problem 4). 

14. The $T$-limit of a bounded sequence. Let $T = (t_{jk})$, $j = 1, 2, \ldots$; 
$k = 1, 2, \ldots$, be the infinite matrix satisfying: 

(1) $\sum_{k=1}^{\infty} |t_{jk}| \le C$, where $C$ is independent of $j$; 

(2) $\sum_{k=1}^{\infty} t_{jk} = s_j$, $\lim_{j \to \infty} s_j = 1$; 

(3) $\lim_{j \to \infty} t_{jk} = 0$ for any $k = 1, 2, \ldots$ 

(the Toeplitz matrix). Also, let $x = (\xi_1, \ldots, \xi_n, \ldots)$ be a bounded sequence. 
We form the sequence $Tx$ of numbers $t_j(x) = \sum_{k=1}^{\infty} t_{jk} \xi_k$; if it exists, the limit 
of the sequence $Tx$ is said to be the $T$-limit of the sequence $x$. Show that: 

(a) if the sequence $x$ is convergent (in the usual sense) to the limit $\xi$, the 
sequence $Tx$ is also convergent to the limit $\xi$; 

(b) if $T = (t_{jk})$ is any matrix with the property that the numbers 
$t_j(x) = \sum_{k=1}^{\infty} t_{jk} \xi_k$ exist and are convergent to $\xi$ for any convergent sequence 
$x = \{\xi_k\}$, $\lim \xi_k = \xi$, $T$ must have properties (1)-(3). 

Hint. (a) It is sufficient to take the case $\xi = 0$. 
(b) Apply the matrix $T$ to the element $(0, \ldots, 0, 1, 0, \ldots)$ and obtain (3). 
Apply $T$ to the element $(1, \ldots, 1, \ldots)$ and obtain (2). The convergence of 
every series (1) is deduced from problem 13. The functionals $t_j(x)$ are convergent 
here for any $x \in c_0$; by the Banach-Steinhaus theorem, their norms 
have a common bound, which leads to (1). 
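
As an illustration of part (a), here is a minimal sketch using the arithmetic-mean (Cesàro) matrix $t_{jk} = 1/j$ for $k \le j$ and $0$ otherwise. This matrix is a standard example chosen for the sketch and is not taken from the text; the helper names `cesaro_row` and `apply_row` are hypothetical, and only the first $N$ rows are examined.

```python
from fractions import Fraction

def cesaro_row(j):
    """Row j (1-based) of the arithmetic-mean matrix: t_jk = 1/j for k <= j, else 0."""
    return [Fraction(1, j)] * j

def apply_row(row, x):
    """t_j(x) = sum_k t_jk * xi_k; entries beyond the end of `row` are zero."""
    return sum(t * xi for t, xi in zip(row, x))

N = 200
convergent = [1 + Fraction(1, k) for k in range(1, N + 1)]   # xi_k = 1 + 1/k, limit 1
alternating = [(-1) ** (k + 1) for k in range(1, N + 1)]     # 1, -1, 1, -1, ...

# Properties (1) and (2): the row sums and absolute row sums are all equal to 1.
row_sums = [sum(cesaro_row(j)) for j in range(1, N + 1)]
abs_sums = [sum(abs(t) for t in cesaro_row(j)) for j in range(1, N + 1)]
# Property (3): t_jk = 1/j tends to 0 as j grows, for each fixed k.

T_convergent = [apply_row(cesaro_row(j), convergent) for j in range(1, N + 1)]
T_alternating = [apply_row(cesaro_row(j), alternating) for j in range(1, N + 1)]

print(float(T_convergent[-1]))  # close to the ordinary limit 1
print(T_alternating[-2:])       # odd rows give 1/j, even rows give 0
```

The second sequence has no ordinary limit, yet its transformed values tend to 0, so in this example the $T$-limit exists for a bounded sequence that does not converge.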


15. Show that the $\alpha$-limit (problems 6-12) is a particular case of the 
$T$-limit. 
Answer. If $\alpha = (\alpha_0, \ldots, \alpha_n, \ldots)$, the corresponding $T$ matrix has the form 
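
The displayed matrix has not survived in this copy. As a hedged sketch only: assuming the $\alpha$-operation of problem 6, $\eta_n = \sum_j \alpha_j \xi_{n+j}$, one consistent reading is the matrix with entries $t_{nk} = \alpha_{k-n}$ for $k \ge n$ and $0$ otherwise. The code below checks this reading on a truncation; the function name `alpha_matrix`, the finite weight vector and the test data are hypothetical, and indices are 0-based.

```python
from fractions import Fraction

def alpha_matrix(alpha, rows, cols):
    """Truncated matrix with t[n][k] = alpha[k - n] if 0 <= k - n < len(alpha), else 0."""
    return [[alpha[k - n] if 0 <= k - n < len(alpha) else Fraction(0)
             for k in range(cols)]
            for n in range(rows)]

alpha = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # non-negative, sums to 1
x = [Fraction(k * k % 7) for k in range(10)]              # arbitrary bounded data

rows = len(x) - len(alpha) + 1                            # rows whose support fits in x
T = alpha_matrix(alpha, rows, len(x))

# Matrix form versus the direct definition of the alpha-operation.
Tx = [sum(t * xi for t, xi in zip(T[n], x)) for n in range(rows)]
direct = [sum(a * x[n + j] for j, a in enumerate(alpha)) for n in range(rows)]

print(Tx == direct)
```

Under this reading every row sums to $\sum_j \alpha_j = 1$, the absolute row sums equal $\sum_j |\alpha_j|$, and each column vanishes below the diagonal, which matches conditions (1)-(3) of problem 14.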


16. Form the Toeplitz matrix $T$ such that $T e' = e + e'$ for 
$e = (1, 1, \ldots, 1, \ldots)$, $e' = (1, -1, 1, -1, \ldots)$. 




Answer. For example: 


Note. The existence of such a matrix $T$ shows that there exists no linear 
functional $f$ on the space $m$ with the properties $f(Tx) = f(x)$, $f(e) = 1$; 
hence the result of problem 12, relating to $\alpha$-limits, cannot be carried over 
to $T$-limits. 
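
The matrix given in the answer to problem 16 is missing from this copy. The following sketch shows one possible matrix only, not necessarily the author's: it assumes the problem asks for $T e' = e + e'$ with $e' = (1, -1, 1, -1, \ldots)$, and places two nonzero entries of row $j$ in columns $2j - 1$ and $2j$. The names `row`, `e_prime` and `apply_to` are hypothetical.

```python
from fractions import Fraction

def row(j):
    """Nonzero entries of row j (1-based) as {column: value}, in columns 2j-1 and 2j."""
    if j % 2 == 1:
        a, b = Fraction(3, 2), Fraction(-1, 2)   # a - b = 2, a + b = 1
    else:
        a, b = Fraction(1, 2), Fraction(1, 2)    # a - b = 0, a + b = 1
    return {2 * j - 1: a, 2 * j: b}

def e_prime(k):
    """k-th term (1-based) of the sequence 1, -1, 1, -1, ..."""
    return 1 if k % 2 == 1 else -1

def apply_to(j, seq):
    """j-th term of T applied to the sequence given by the function `seq`."""
    return sum(v * seq(k) for k, v in row(j).items())

rows = range(1, 50)
row_sums = [sum(row(j).values()) for j in rows]                  # property (2): s_j = 1
abs_sums = [sum(abs(v) for v in row(j).values()) for j in rows]  # property (1): C = 2
image = [apply_to(j, e_prime) for j in rows]                     # should equal 1 + e'_j

print(image[:4])
```

Column $k$ is nonzero only in the single row $j$ with $2j - 1 \le k \le 2j$, so $t_{jk} \to 0$ as $j \to \infty$ for each fixed $k$, and all three Toeplitz conditions hold for this choice.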

17. Show that the closed subspace of space $L_2(-\infty, \infty)$ that contains 
some function $y(x) \not\equiv 0$, all its displacements and all its products with 
exponents $e^{i\sigma x}$, coincides with the whole space $L_2(-\infty, \infty)$. 

Hint. Use corollary 3 of the Hahn-Banach theorem, the general form of 
a continuous linear functional on $L_2$, and the uniqueness theorem for Fourier 
transforms. 




INDEX 


Absolutely continuous functions 
304-308, 309, 310, 315, 317, 
329 

Absolutely monotonic functions 347 

AKHIEZER, N.I. 104, 141, 281, 351 

Almost periodic functions 437 

Arbitrary kernels, non-homogeneous 
integral equations with 
250-261 

Arzela’s theorem 60, 380, 4382 

Axiom of choice 456, 459 


BANACH, S. 177, 460, 465 

Banach space 64 

Banach-Steinhaus theorem 465, 
470 

BARI, N.C. 211 

Beppo Levi’s theorem 158, 162, 
164-165, 172, 175, 176, 182, 
184 

BERNOULLI, J. 101 

BERNSTEIN, F. 7 

BERNSTEIN, S.N. 347 

Bernstein’s theorem 7-9 

Bessel functions 446, 447 

Bessel’s inequality 203, 231, 232, 
270 

BLOKHINTSEV, D.I. 454 

BOCHNER, S. 438, 449 

Bochner-Khinchin theorem 
438-442, 453, 454 

BOLZANO, B. 283 

Bolzano-Weierstrass theorem 65 

Bounded function 231 

Bounded variation, function of See 
Functions 

Brachistochrone 102, 114 

BREMMER, H. 454 

BUNYAKOVSKY, V. Ya. 193 


Calculus of variations 23, 78-141 
aim of 78 
definition 78 
historical note 141 
CANTOR, G. 13, 19, 20, 41, 48 
Cantor sets 40, 144-145, 303 
Cantor’s theory 48 
CARLEMANN, T. 277 
CARLEMAN, T. 413, 414 
Carleman’s condition 413, 417 
Carleman’s theorem 413-421 
CARSLAW, H.S. 412 
CAUCHY 142 
Cauchy-Bunyakovsky inequality 
192, 193, 195, 211, 214, 
221-222, 232, 268, 275, 422, 
432 


Cauchy sequence 49 
Cauchy’s criterion 36, 37, 390 
Cauchy’s formula 405, 413, 419, 
427, 429, 432 
Cauchy’s theorem 387 
Characteristic functions 
determination of 233 
of Fredholm’s operator 230-232, 
234 
of measurable sets 165, 172, 317 
of operators 237, 245 
of Sturm—Liouville equation 238 
orthogonal system 240, 245 
Characteristic number 219 
Characteristic subspace 220 
Characteristic value 219, 225, 226, 
231, 232, 233, 240, 245, 272 
Characteristic vectors 219, 223, 
224, 225, 226, 232, 257, 271 
CHERKASSOV, A. 48 
Class C* functions 148-156, 180 
definition 152 
Class Cj functions, integral for 321 


Closed set sphere 31 
Closed sphere 26, 39 
Compact spaces 55 
Completely continuous operators 
221, 224, 228, 229, 230, 272 
Contiguous open intervals 32 
Continuous abstract function 
373-377 
Continuous functions 37, 52-62, 
232, 262, 283 
definition 194 
Fourier development 371-373 
integral of 311 
of several variables 61 
on closed interval 82, 83 
problem on construction 310 
Continuous linear functionals 317, 
342, 465-467 
Continuous non-decreasing function 
317 
Continuous symmetric kernel 239 
Continuous symmetric operator 270 
Convergence in the mean 30, 38 
Convergence of Fourier series 
359-380 
Convergent sequences 28-35, 55 
COURANT, R. 133, 234, 245, 446 


Degenerate kernel 228-230, 233, 
252 
Degenerate operators 253, 257-259 
DENJOY, A. 413 
Derivative, determination of func- 
tion from 302-310 
of absolutely continuous function 
308 
of non-decreasing function 
283-294, 302 
DIDO 108, 112 
Differentiable function 79, 282 
Differentiable functionals 79-88 
extrema of 88-93 
on normed linear spaces 89 
relative extrema of 88-89 
stationary point 89 
type $\int_a^b f(x, y, y')\,dx$ 93-119 
conditional extrema 108-113 
with free end-points 113-119 

with higher derivatives 134-141 
with several independent variables 
127-134 
with several unknown functions 
119-127 
Differential equations 46, 261 
solution by Laplace transform 
405-412 
Differentiation 283-358 
Fourier transform and 393 
historical note 357 
of functions of sets 352-357 
of series of monotone functions 
291 
Dini point 379 
Dini’s condition 360, 363, 365, 
384-385 
Dini’s theorem 235 
Dirichlet’s exterior problem 264 to 
266 
Dirichlet’s formula 348 
Dirichlet’s function 154, 155 
Dirichlet’s interior problem 264 to 
265 

Dirichlet’s kernel 361, 366, 377 

Dirichlet’s problem 128-129, 264 
to 266 

Disc, map into right half-plane 350 

Du Bois—Reymond, lemma of 105 
to 108, 120 

Dyadic representation of real num- 
bers 14-16 


EGOROFF, D. F. 177, 178 

Egoroff’s theorem 178 

Energy integral 125 

Equicontinuity condition 440 

Equivalence 25 

Equivalent functions 162 

Euclidean space 189, 192, 200, 216, 
252, 253, 254, 255 

EULER, L. 141, 454 

Euler-Ostrogradsky equation 128 
to 130, 134, 138, 139 

Euler’s equation 95, 97-99, 100, 
108, 105, 112, 116, 118-120, 
124, 129, 133, 136 

Extrema of differentiable functionals 
88-93 

Extremum problems 78 


Factor-space 69 
FATOU 162 
Fatou’s lemma 162, 164, 165 
Fatou’s theorem 303 
FEJER, L. 378 
Fejer’s kernel 377, 378, 388, 452 
Fejer’s theorem 392 
Fermat’s principle 107 
FISCHER, E. 163 
Fischer-Riesz theorem 163, 188 
Fixed point principle 43-48, 273 
FOMIN, S. V. 141 
FOURIER, J.B. 454 
Fourier coefficients 202, 203, 204, 
207, 362 
Fourier integral 381 
arithmetic means of 388-392 
Fejer’s kernel for 388 
historical note 454 
Fourier-Lebesgue integral 437 
Fourier series 206, 380, 381 
convergence of 359-380 
development of functions in 359 
historical note 454 
Fourier-Stieltjes integral 436, 437 
Fourier-Stieltjes transform 436 to 
442 
Fourier transform 359-455 
and differentiation 393 
and resultant 394 
application to solution of thermal 
conductivity equation 396 
calculation of 385 
definition 381 
for several independent variables 
442-454 
historical note 454 
in Class $L_2(-\infty, \infty)$ 421-436 
in spherical coordinates 445 
inverse 381 
of square-summable function 426 
of summable function 392 
smoothness of 397-403 
uniqueness theorem for 392, 452, 
471 
FRAENKEL, A.A. 20 
FRÉCHET, M. 77 
FREDHOLM, E. 251, 280 
Fredholm alternative 260, 265, 
266, 271, 280 
Fredholm’s equation 280 
Fredholm’s formulae 277 
Fredholm’s functions 279 
Fredholm’s integral equation 264, 
266 
Fredholm’s integral operator 213, 
227, 231, 232, 236, 239, 240, 
249, 270, 271, 274, 275, 278 
characteristic functions of 230, 
234 
characteristic vectors of 220 
norm of 217 
with degenerate kernel 228 
with square-summable kernel 
228-230 
Fredholm’s theorem 251-254, 271 
FRIEDMANN, B. M. 280 
FUBINI, G. 181 
Fubini’s little theorem 291, 293, 
298, 299 
Fubini’s theorem 184, 214, 227, 
228, 229, 274, 394-395, 443, 
444 
Functional spaces 23 
Functions, determination from deri- 
vative 302-310 
of bounded variation 295-302, 
306, 309, 310, 346, 436, 465 
of several variables 310-319 
Fundamental sequences 35-37, 49 
to 51, 163, 194 


GEL’FAND, I.M. 72, 141, 455 
Generating function 324 

for several variables 340 
GLAZMAN, I.M. 281, 351 


GÖDEL, K. 456 
Green’s function 244, 245 
GRUSHIN, V. V. 12 


HAHN 460 

Hahn-Banach theorem 460-471 

HALMOS, P. R. 358 

HAMILTON, W. 123 

Hamilton’s functional 126, 130, 139 

Hamilton’s variational principle 
123, 126 

Harmonic functions 261-264, 267 

HAUSDORFF, F. 20, 48, 57, 77 

HELLY, E. 334, 337 

Helly’s theorems 334-338, 349, 
439, 442 

HERGLOTZ, G. 351 

Hermite functions 200, 206, 403, 
432 
HILBERT, D. 133, 221, 234, 245, 
280, 446 


Hilbert-Schmidt condition 231 
Hilbert-Schmidt theorem 232 
Hilbert space 189-281 
applications to potential theory 
261-267 
complete 194, 201, 203, 211 
completeness criterion 204-205 
completion 195-196 
complex 267 
complex extension 269 
countable dimension 207 
definitions and examples 189 to 
197 
finite-dimensional 189 
historical note 280 
incomplete 195 
isomorphism of 191 
isomorphism of countable-dimensional 207 
length of vector and angle between 
vectors in 191 
limiting process in 194 
linear functional 211 
linear operators 212-227 
minimal systems 211 
norm in 193 
one-dimensional 212 


orthogonal complements 209 to 
211 
orthogonal resolutions 197-212 


orthogonalisation 198-200, 209 
orthogonality of 197 
orthonormal systems in infinite- 
dimensional 201 
quadratically proximate systems 
211 
separability 208 
Hilbert’s theorem 226, 227, 230 
Hilbert-Tonelli theorem 104 
Holder’s inequality 186, 187 
Homeomorphic linear functionals 
460 
Homogeneous functional space 366 
to 369 


Identity operator 213 
Inclusions 2 
Infinite interval, case of 179-180 
Integral, concept of 142 
for Class Cj functions 321 
Fourier See Fourier integral 
Fourier-Lebesgue 437 
Fourier—Stieltjes 436, 437 
general concept of 282 
indefinite 282 
of summable function 298 to 
299 
Lebesgue See Lebesgue integral 
Lebesgue-Stieltjes See Lebes- 
gue-—Stieltjes integral 
Poisson’s 397 
Riemann See Riemann integral 
Riemann-Stieltjes See Rie- 
mann-Stieltjes integral 
Stieltjes See Stieltjes integral 
theory of 142-188 
case of infinite interval 179 to 
180 
case of several independent variables 
181-188 
Class C* functions 148-156 
historical note 188 
Lebesgue’s 165-179 
measure of sets 165-179 
step function 148, 181 
summable functions 156-165 
Integral equations 23, 271, 278 
first kind 248 
Fredholm’s 264, 266 
non-homogeneous, with arbitrary 
kernels 250-261 
with symmetric kernels 246 to 
250 
second kind 248 
theory of, historical note 280 
with complex parameters 267 to 
280 
with degenerate kernels 252 
Integral operators with square- 
summable kernels 227-236 
Integral quadratic form 235 
Integrating function 325 
Integration 283-358 
by parts 310 
historical note 357 
Lebesgue, theory of 165-179 
of continuous abstract functions 
373 
of series with positive terms 158 
Inverse Fourier transform 381 
Inversion formula 399, 405, 407, 
438, 444, 447 
Isolated points 39 
Isomorphism 67 
between Euclidean spaces 200 
between Hilbert spaces 191 
of countable-dimensional Hilbert 
spaces 207 


JAEGER, J.C. 412 
Jordan’s lemma 386, 410 


KANTOROVICH, L. V. 234 
KHINCHIN, A. Ya. xii, 384, 438 
KOLMOGOROV, A. N. xi, 363 
KORENBLUM, B. J. 347 

KREIN, M.G. xii 

KRYLOV, V. I. 234 


LAGRANGE, J. 141 
Lagrange multipliers 111 
Lagrange’s function 123, 130, 134, 
139 
Lagrange system of equations of the 
second order 124 
Laguerre functions 200, 206, 403 
LAPLACE, P.S. 454 
Laplace transform 403-412 
definition 404 
for differential equations 405-412 
for quasi-analytic classes of func- 
tions 412-421 
historical note 454 
LAVRENT’EV, M. A. 97 
LEBESGUE, H. 161, 172-176, 188, 
284, 299, 308, 357 
LEBESGUE, H. L. 188 
Lebesgue-integrable functions 156 
Lebesgue integral 175, 188, 191, 
326, 340, 353, 454 
Lebesgue integration, theory of 165 
to 179 
Lebesgue-measurable functions 
173, 174 
Lebesgue-measurable sets 319 to 
329, 340 
Lebesgue measure 320 
Lebesgue points 300-302, 314, 379 
Lebesgue space 163, 180, 181 
Lebesgue’s theorem 284-289, 297 
Lebesgue’s theory of integral 181 
Lebesgue’s theory of measure 176 
to 177 
Lebesgue-Stieltjes integral 322, 
323, 324, 340 
Legendre’s condition 96 
Legendre’s polynomials 200, 206 
Leibnitz’ formula 399 
Limit point 30 
Linear functionals 342 
sequences of 465 
theorems on 460-471 
Linear functions on linear space 72 
Linear isometry 67 
Linear net 312 
Linear operator 214, 229 
norm of 215 




